Mucho going on in View (and more generally VDI) land. Part I was posted here.
If you’re interested in a quick catch-up, read on…
View 4.5 beta
The existence of this has been discussed by others (here, and here) – I will neither confirm nor deny. What I can say is that the ongoing march of improved simplicity, scale, function in the hosted virtual desktop use case is well underway, and that every day, more and more customers are starting to embrace it.
I’m part of the internal EMC View 4 pilot rollout. For me personally, Windows 7 and check-in/out are a huge deal (and neither is supported in View 4). Then again, I wouldn’t describe myself as the ideal target user (at least right now) for Client Virtualization (as an extremely mobile laptop user, who is also constantly building my own environment – I’m also using the Office 2010 beta right now, as an example).
That said – expanding the use cases out to these types of users is very important.
I do use it all the time – but not as my primary machine. How do I use it? The vSpecialist team uses View as the front-end to much of our demo lab gear.
If you want to hear about our own experiences with View 4 in our internal rollout – there was a recent EMC IT webcast, which was recorded. You can see that here.
VMware View Launch Tour
The VMware View/EMC Launch tour is coming to a city near you… I should have posted this earlier, as there were dates I missed, but there are several left in Canada and the US. You can register here.
The “who’s better for View vendor histrionics” continue…
At Partner Exchange – my colleagues from NetApp claimed that “9 of 10 customers using VDI use NetApp”. As you can imagine, that caused some eyebrows to be raised. Vaughn reiterated that claim recently at Mike Laverick’s blog/“chinwag” here… (35 minutes in) as well as making some comments about me (calling me “Rupert Murdoch” for hiring good folks, and EMC the “Evil Galactic Empire” about 10 minutes in :-)
BTW, correcting one comment in the recording – as EMC, I can’t hire from partners who are focused and committed EMC partners. But I will also say this – people are the thing that makes the world go round, and I’m certainly OK about taking top-notch people from competitors and competitive partners!
Vaughn is a good guy, and you can hear that in Mike’s podcast (which I would recommend listening to). He and I are both passionate about virtualization, and love our gigs. I personally agree with almost everything he said, particularly around “stack”… and disagree vehemently with the characterizations about me and EMC, View market share, and the implication that the VCE Coalition (integrated tech, selling, support, and joint venture) and the partnership between Cisco and NetApp are equivalent. But then again, you’d probably expect that :-).
First things first… there is no data (none) to support the NetApp claim. I was going to let it go and not respond publicly, but it keeps getting brought up, so…
While I think NetApp has great technology, and a strong View go-to-market, this claim didn’t sound right to me, so I pinged the View product team. Their comments:
"based on our VDI run rate and overall VDI market size, also supported by what we are hearing from our software partners, we don't see how this can possibly be true."
On a similar thread, I also ran into several customers in Southern California who were told by NetApp that “VMware runs their internal production View deployment on NetApp”. This is also not correct. If anyone would like to talk to the VMware IT folks about what they deploy internally, I’m happy to arrange that discussion.
NetApp has solid solutions and is a fine company – I don’t think they need to do these things. I suppose in some way it’s effective, as it forces me/us to spend time correcting things.
IO scaling in the VDI use case
More on this front… The question of IO scaling is very real, particularly at larger scales (thousands of clients), and it came up at the two biggest financial customers in NYC, whom I visited last week.
To recap: “while the discussion of VDI storage costs tends to focus on $/GB (as people who are not knowledgeable about storage generally think in “GB”), the cost of a configuration that supports the IO load through the lifecycle of the client community is often governed by many factors – including BOTH capacity and performance as well as functional use cases”.
Vaughn’s posted on his blog here some comments about the VMware Express van (the van itself is very cool – I would highly recommend checking it out if you can). I think it was interesting to see the comments in response to the thread.
Personally, I was blown away by the claim Vaughn initially made about “12U, two disk shelves, 5000 users”.
So, just like any extreme claim (like the incorrect statements in the section above), I did some digging. I happen to know the fellow in the VMware GETO team he was talking to, and I asked him. His comment: “we currently have enough *compute* capacity to run around 5000 desktops”. He also took umbrage at being taken so out of context, but I won’t post the rest of his comments as they weren’t nice :-)
I did also personally follow up with Vaughn – as a statement implying that 28 disks could support 5000 VDI users seemed… “off”.
To understand why it seemed “off”, here’s some quick back-of-napkin math. Assuming 5000 users drive 10 IOps each on average, with a 50/50 read/write mix, that’s 25,000 random read and 25,000 random write IOs per second coming into the storage array. Assuming no magic (and there is some magic) between the host and the back-end disks (this is the most conservative assumption), this would mean each of those 28 disk drives is able to do about 1785 random IOps :-) This is about 10x the amount you would expect them to do. Array magic can make the backend do more than it could with no magic.
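The napkin math above can be written out as a small calculation. This is only a sketch: the function name and parameters are made up for illustration, the inputs are the figures quoted in the text (5000 users, 10 IOps each, a 50/50 mix, 28 disks), and no cache, RAID, or other array effects are modeled.

```python
def per_disk_iops(users, iops_per_user, read_fraction, disks):
    """Return (read_iops, write_iops, iops_per_disk) with no array "magic".

    Assumes every host IO lands on a back-end disk unchanged: no cache hits,
    no write coalescing, and no RAID write penalty.
    """
    total = users * iops_per_user
    reads = total * read_fraction
    writes = total - reads
    return reads, writes, total / disks


reads, writes, per_disk = per_disk_iops(
    users=5000, iops_per_user=10, read_fraction=0.5, disks=28
)
print(f"reads/s={reads:.0f} writes/s={writes:.0f} per-disk IOps={per_disk:.1f}")
```

Changing `iops_per_user` or `disks` shows how quickly the per-disk number moves away from what a single spindle can plausibly deliver.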
So, let’s talk about array magic.
All arrays have technologies that apply in varying cases to the above.
After the outreach about the number of disks, Vaughn asked some of his technical folks and they ran it through the NetApp sizing tool for VDI. His correction on the post (thank you, Vaughn – and BTW, there have been times where Vaughn has corrected me – like when he found an error in our VM alignment docs that got fixed) is that for 4 IOps/user and a 50/50 read/write mix, the configuration would need 56 spindles, not 24. I’m assuming these were 15K drives.
I will note that 4 IOps per user is much lower than I’m seeing in practice. In practice, I tend to see 8-15 per user, and it’s not unusual to see 25 at peak… But let’s continue with an assumption of 4. If it were larger, the number of drives just scales up.
This makes more sense, as that number translates to 357 IOps per drive (4 per user, 5000 users, divided into 56 drives). This is something to be proud of, because absent “array magic” the max for a 15K drive would be around 180 IOps – so if they can get 360 “effective IOps”, that’s a 2x improvement. Of course, I want to note that the calculations involve no RAID loss considerations.
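To make the “effective IOps” reasoning concrete, here is a rough sizing sketch. The 180 IOps per 15K drive figure and the per-user loads (4, 8, 15, 25) are taken from the text; the function names and the idea of a single “magic factor” multiplier are simplifications introduced here, and RAID loss is ignored just as it is above.

```python
import math

RAW_15K_IOPS = 180  # rough per-drive figure quoted in the text for a 15K drive


def effective_iops_per_drive(users, iops_per_user, drives):
    """Host-side IOps each drive must appear to deliver (no RAID loss modeled)."""
    return users * iops_per_user / drives


def magic_factor(users, iops_per_user, drives, raw_iops=RAW_15K_IOPS):
    """Ratio of required effective IOps to what a bare drive can do."""
    return effective_iops_per_drive(users, iops_per_user, drives) / raw_iops


def drives_needed(users, iops_per_user, factor, raw_iops=RAW_15K_IOPS):
    """Drive count for a load, assuming the array delivers `factor` x raw IOps."""
    return math.ceil(users * iops_per_user / (raw_iops * factor))


print(round(effective_iops_per_drive(5000, 4, 56)))  # effective IOps per drive
print(round(magic_factor(5000, 4, 56), 2))           # implied "array magic"

# How the drive count scales with heavier per-user loads at a 2x factor.
for load in (4, 8, 15, 25):
    print(load, drives_needed(5000, load, factor=2.0))
```

The same functions can be rerun with a claimed factor of 10 to see why that kind of claim deserves suspicion.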
If a vendor (NetApp or EMC – we do seem to be the ones most focused on the VDI use cases) says they can do a 2x “array magic” reduction in IO in their processors that sit between the host and the spindles, lean forward and listen. Poke at it a lot – because 2x is still a big number, but listen. If they claim 10x “array magic”, walk away slowly – they are dangerous to you as the customer, and themselves as the vendor :-)
Here’s my advice:
There’s a certain break-point where configurations tend to be capacity-gated vs. performance gated.
Then, go through and look at the cost-to-serve a desktop with that given configuration.
When you go through that process, I’m confident that EMC’s VDI solutions can prove competitive cost-to-serve-a-client (that and functional use cases, and availability requirements). I know because I (and my team) do it every day.
BTW – the way (IMO) to achieve order-of-magnitude capex improvements is to:
Then, move on to “tens of percentages” optimizations.
Remember – whether it’s VMDK block level dedupe (which regularly shows 90% capacity savings), or VMDK compression (which shows 40-60% on top of thin), the question isn’t how much capacity is saved, but rather how much the total configuration costs (the $/VM, physical space/VM and so on). You should challenge every vendor (this certainly applies to EMC) to express the solution in those types of metrics, not in terms of any given feature.
Here’s data from lab analysis of the effect of the base replica being on EFD…
Here’s data from a customer (in this case a Citrix case of user.dat files where behavior is governed by CIFS operations per second, but applies similarly to the View use case). Here we have an NS-960 with 5 EFDs, compared with the older NS-80s with 20 15K drives.
Not only were the NS-960s able to sustain 80K CIFS Op/Sec (which is a LOT) - the EFDs were able to do 33x more random IO per backend spindle, and do it with a 1.5ms response time rather than 93ms response time at the filesystem level observed on traditional 15K RPM disks.
I hate to be so pedantic about this topic (VDI storage design), but it’s important to me for the reason I mentioned back in this post. This IO density question is the number 3 reason I see View projects not starting (#2 being total TCO; and #1 being client experience), and #1 reason why they go sideways in late stage scale-up. Bunch of interesting startups here – Atlantis and others….
These two blog posts discuss this in good detail and additional perspectives:
There was some question about whether or not SimDK includes the hidden/internal vSphere4 APIs. The answer is not just yes, but a resounding yes! The reason I am so enthusiastic about the subject is that not only does SimDK’s WSDL model expose the vSphere4 API’s internal service content, it also makes it possible to use the SimDK WSDL to communicate with an actual vSphere4 server and access these methods.
I’ve created two new classes in the SimDK stubs module that show how to do this:
As you can see, not only does SimDK provide simulation, but it can be used as a platform for accessing all of the features of a real vSphere4 server that have remained out of easy reach until now!
The SimDK WSDL files are automatically generated and therefore not stored in the source repository, but you can access them here:

In yesterday's post we discussed how VI admins are becoming storage admins as an unexpected byproduct of the virtualization of our data centers. We looked at the storage provisioning process with emphasis on the number of tasks that a VI admin must execute. After reviewing the traditional model we introduced the new model: one jointly created by NetApp and VMware engineering and designed to simplify, automate, and standardize storage operations in the cloud.
The last item we discussed was how the new model allows storage admins to provision and secure 'raw' storage resource pools for use by the VI admin team. This model allows the storage team to focus on higher value functions such as resource utilization, quality of service, and data protection.
In today’s post our discussion shifts to the storage management integrations available to the VI admins thru the NetApp vCenter plug-ins.
As you probably know, NetApp arrays run Data ONTAP across all of our storage controllers, including when we virtualize other arrays. There is a tremendous amount of value in unified architectures like Data ONTAP, ESX/ESXi, and NX-OS, as the architecture allows vendors, partners, 3rd party developers, service providers, and customers to develop a set of tools that work across the data center regardless of hardware platform.
The NetApp plug-ins for vCenter leverage the unified architecture in order to deliver all functionality across every array and for every storage protocol. Imagine deploying processes across all of your datacenters without having to consider the hardware model.
In the rare event a function is unavailable, it tends to be due to a lack of support for a particular operation within the entire architecture stack.
Audit and Automate ESX/ESXi Storage Settings
I felt that a good place to start was with basic connectivity. By selecting the NetApp tab in vCenter the VI admin is shown the storage arrays configured in part 1 of this series. The storage arrays are able to identify the ESX/ESXi hosts connected to them and thru vCenter audit the settings related to FC, FCoE, iSCSI, and NFS to see if they are set to the values defined in NetApp best practices. Should these settings require updating, the VI Admin can select an individual host or multiple hosts and execute a non-disruptive update to the storage settings.
The audit process can be run at any time without disruption to the production environment. This capability empowers VI admins with the ability to ensure optimal uptime as the environment grows. The settings we update are currently not covered by VMware host profiles; in other words, we are extending the automated configuration process. Finally, the ability to update the host settings is limited to vSphere hosts. From what I understand, VI3 hosts do not provide the APIs we need to make our changes; however, once a system is identified as being in non-compliance, one can make the settings manually as outlined in TR-3428.
Report Storage Details and Utilization
Also available in the NetApp tab is the ability to report on storage utilization for SAN & NAS based datastores. When one selects a datastore they are presented with a large amount of details around the storage object such as LUN serial number, igroup, ALUA enablement, dedupe savings, etc.
One of the key benefits in this screen is the ability to report on storage utilization through the numerous layers, beginning at the datastore and ending at the aggregate (a physical collection of RAID protected disks).
The value here really stands out when one enables storage savings technologies like our block level data deduplication, zero cost clones, or thin provisioning.
Report Storage Faults
The NetApp plug-in also provides feedback to the VI Admins as to the health of the storage controllers. The ability to report the 'health' of the physical infrastructure allows us to reduce the time it takes the VI admins and storage admins to address an issue.
Ensure Optimal I/O Settings within VMs
Another component of our plug-in is the ability to audit and adjust settings within the VM in order to ensure optimal I/O. The first set of tools includes scripts that can be run from within VMs or applied to VM templates, and which set local SCSI settings within the GOS.
The second set of tools is our MBRscan & MBRalign. The two tools combine to audit and correct the partitions and file systems within the VM. I've covered the importance of partition alignment. If you're unfamiliar with this topic, I'd suggest you get acquainted, as it is key to ensuring optimal VMs especially if one leverages the cloud and runs VMs across a dissimilar set of storage arrays.
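For readers new to the topic, the basic idea behind an alignment audit can be sketched in a few lines. This is not how MBRscan or MBRalign are implemented; it is only an illustration of the arithmetic, and the 512-byte sector size and 4096-byte array block size are assumptions that should be checked against your own guest OS and array.

```python
SECTOR_BYTES = 512   # classic sector size, assumed here
BLOCK_BYTES = 4096   # assumed array block size; check your array's documentation


def is_aligned(start_sector, sector_bytes=SECTOR_BYTES, block_bytes=BLOCK_BYTES):
    """True if the partition's starting byte offset falls on a block boundary."""
    return (start_sector * sector_bytes) % block_bytes == 0


# Sector 63 is the traditional MBR default on older guest OSes;
# 64 and 2048 are shown as examples of aligned starting sectors.
for sector in (63, 64, 2048):
    print(sector, is_aligned(sector))
```

A partition that starts off a block boundary makes guest IOs straddle two array blocks, which is why the audit matters more as VM counts grow.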
Provisioning Datastores from Resource Pools
Do you remember where we left off in Part 1 of this series? It was with the storage admin provisioning pools of storage resources for use by the VI Admin. Do you recall that the storage admin did not have to create LUNs, FlexVols, LUN masks, NFS exports, set multipathing policies, etc...? We've saved all of these nice details for our plug-ins.
With NetApp one can select an ESX/ESXi host, cluster, or data center and provision a datastore to that unit. The datastore can be FC, FCoE, iSCSI, or NFS and will be configured from one of the resource pools established by the storage admin. Our plug-in will handle path selection, load balancing, setting multi-pathing policies, securing the storage target, and the enabling of thin provisioning and data dedupe. While the VI admin receives automation and an on-demand provisioning process, the environment receives the consistent implementation of NetApp best practices.
I trust you as the reader understand the number of different options one must consider when deploying SAN or NAS with VI3 and vSphere. Suffice it to say, each protocol is different with each major release of the hypervisor.
Managing and Reconfiguring Datastores
Ever provision storage and later wish you could modify it? Maybe a datastore is nearing its capacity limit. Should you initiate a migration with Storage VMotion? While I don't know if you should or shouldn’t, with NetApp you have additional choices in the areas of dynamic, non-disruptive resizing of datastores (sorry, only NFS datastores can be shrunk).
To change the capacity or storage efficiency settings, one simply needs to select the datastore, right click, and choose the appropriate option from within vCenter.
Instant Provisioning of Pre-Deduplicated VMs as Servers and Desktops
The NetApp plug-ins began when a number of us at NetApp and VMware started looking at ways to reduce storage costs with virtual desktop deployments. This demo from VMworld 2007 is the earliest public demonstration of our zero-cost cloning technology, FlexClone, cloning files. Until then FlexClone was limited to LUNs and FlexVols.
<object height="304" width="380"><param name="movie" value="http://www.youtube.com/v/7Miv0PiJFzM&hl=en_US&fs=1&"><param name="allowFullScreen" value="true"><param name="allowscriptaccess" value="always"><embed allowfullscreen="true" allowscriptaccess="always" height="304" src="http://www.youtube.com/v/7Miv0PiJFzM&hl=en_US&fs=1&" type="application/x-shockwave-flash" width="380"></embed></object>
Today we can deploy a single VM, multiple VMs or an entire desktop pool (across multiple datastores) instantly, without consuming any additional storage in the process. By simply selecting a running or shutdown VM, template, or vApp, the VI admin can deploy the industry's most space efficient clones.
Unlike some cloning technologies, NetApp clones are permanent, high performance VMs that can be handled in the same manner as any VM. There are no restrictions with our cloning.
Our cloning is integrated into vCenter, View Manager, XenDesktop, and Quest vWorkspace. If our clones didn't work so well, you wouldn’t find them so tightly integrated across products from the industry's leading vendors of virtual desktop solutions.
There's a number of additional features available when cloning VMs with our plug-in, but those details are probably better suited for a future post. (hint - think hardware assisted desktop refresh)
Wrapping Up Part 2
I hope you have found the details around our (did I mention free?) vCenter plug-ins useful. I'd love to hear what you think. Please feel free to post comments and suggestions for future updates.
I have one additional post planned in this series where we take a look at the plug-ins available from the rest of the storage industry. I'd like to think of part 3 as an audit of the storage industry's response to the challenge I made following VMworld 2009.
It's late, and I need to wrap up. I hope you'll be back for tomorrow's conclusion.
To be honest, I am surprised that I have not blogged about this before, but today I would like to talk about how virtual machine files are placed on the hard disk.
Virtual Machine files
The first thing to know is what files are used to create a virtual machine:
Understanding data roots
Hyper-V has a concept of the “virtual machine data root” and the “virtual machine snapshot root”. These are the locations where the virtual machine configuration (.XML) and saved state (.BIN & .VSV) files are stored. For example – a virtual machine which had a virtual machine data root of “D:\Foo” and a snapshot data root of “D:\Foo” and had two snapshots would have a file structure like this:
D:\Foo
D:\Foo\Snapshots
D:\Foo\Snapshots\[Snapshot #1 GUID directory]
D:\Foo\Snapshots\[Snapshot #1 GUID].XML
D:\Foo\Snapshots\[Snapshot #2 GUID directory]
D:\Foo\Snapshots\[Snapshot #2 GUID].XML
D:\Foo\Virtual Machines
D:\Foo\Virtual Machines\[Virtual Machine GUID directory]
D:\Foo\Virtual Machines\[Virtual Machine GUID].XML
If the snapshots and the virtual machine had saved states associated with them – then the file structure would look like this:
D:\Foo
D:\Foo\Snapshots
D:\Foo\Snapshots\[Snapshot #1 GUID directory]
D:\Foo\Snapshots\[Snapshot #1 GUID directory]\[Snapshot #1 GUID].BIN
D:\Foo\Snapshots\[Snapshot #1 GUID directory]\[Snapshot #1 GUID].VSV
D:\Foo\Snapshots\[Snapshot #1 GUID].XML
D:\Foo\Snapshots\[Snapshot #2 GUID directory]
D:\Foo\Snapshots\[Snapshot #2 GUID directory]\[Snapshot #2 GUID].BIN
D:\Foo\Snapshots\[Snapshot #2 GUID directory]\[Snapshot #2 GUID].VSV
D:\Foo\Snapshots\[Snapshot #2 GUID].XML
D:\Foo\Virtual Machines
D:\Foo\Virtual Machines\[Virtual Machine GUID directory]
D:\Foo\Virtual Machines\[Virtual Machine GUID directory]\[Virtual Machine GUID].BIN
D:\Foo\Virtual Machines\[Virtual Machine GUID directory]\[Virtual Machine GUID].VSV
D:\Foo\Virtual Machines\[Virtual Machine GUID].XML
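The naming pattern in the two listings above can be captured in a short sketch. This is purely illustrative: the function name is invented, the GUIDs are placeholder strings rather than values Hyper-V would generate, and the code only rebuilds the path pattern shown above (it omits the root folder itself and does not touch a real Hyper-V host).

```python
from pathlib import PureWindowsPath


def vm_file_layout(data_root, snapshot_root, vm_id, snapshot_ids, saved_state=False):
    """List the configuration and saved state paths, following the listings above.

    vm_id and snapshot_ids stand in for the GUIDs Hyper-V generates.
    """
    paths = []

    def add_entity(parent, entity_id):
        # Each VM or snapshot gets a GUID-named folder plus a GUID-named .XML file.
        folder = parent / entity_id
        paths.append(folder)
        if saved_state:
            paths.append(folder / f"{entity_id}.BIN")
            paths.append(folder / f"{entity_id}.VSV")
        paths.append(parent / f"{entity_id}.XML")

    snapshots_dir = PureWindowsPath(snapshot_root) / "Snapshots"
    paths.append(snapshots_dir)
    for snap_id in snapshot_ids:
        add_entity(snapshots_dir, snap_id)

    vms_dir = PureWindowsPath(data_root) / "Virtual Machines"
    paths.append(vms_dir)
    add_entity(vms_dir, vm_id)
    return [str(p) for p in paths]


for path in vm_file_layout(r"D:\Foo", r"D:\Foo", "VM-GUID",
                           ["SNAP1-GUID", "SNAP2-GUID"], saved_state=True):
    print(path)
```

Passing a different `snapshot_root` shows how the Snapshots tree moves independently of the Virtual Machines tree.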
Some key things to highlight about data roots:
Understanding VHD and AVHD locations
.VHD files can be created pretty much anywhere you want. In Windows Server 2008 R2, .AVHD files are always created in the same location as their parent .VHD files.
Common Virtual Machine File Configuration #1 – Default Virtual Machine Data Root
A virtual machine with a default virtual machine data root is one where you created the virtual machine and accepted the default options in the new virtual machine wizard, specifically where you did not check to “Store the virtual machine in a different location” on the first page of the new virtual machine wizard:
In this configuration option the virtual machine data root and snapshot data root will be set to the path specified under the Hyper-V Settings in the “Virtual Machines” setting, and the virtual hard disk will be created under the path specified under the Hyper-V Settings in the “Virtual Hard Disks” setting:
These paths are normally set to “C:\ProgramData\Microsoft\Windows\Hyper-V” for the “Virtual Machines” setting and “C:\Users\Public\Documents\Hyper-V\Virtual Hard Disks” for the “Virtual Hard Disks” setting. That said – I usually change these settings to “D:\Hyper-V\Configuration Files” and “D:\Hyper-V\Virtual Hard Disks” on my systems as I find this easier to work with.
Common Virtual Machine File Configuration #2 – External Virtual Machine Data Root
If you do select to “Store the virtual machine in a different location” you will get what we call a virtual machine with an external virtual machine data root.
With this option we create a new folder named after the virtual machine, and set the virtual machine data root and snapshot data root to this folder. We also default to creating the virtual hard disk in this new folder.
Common Virtual Machine File Configuration #3 – Exported / Imported virtual machine
If you export a virtual machine and then import it without checking the option to “Duplicate all files so the same virtual machine can be imported again”, you will end up with a virtual machine that looks like a virtual machine with an external data root – but there will be one difference.
Instead of having the virtual hard disks stored in the same location as the virtual machine data root – they will be stored in a “Virtual Hard Disks” folder under the virtual machine data root folder.
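The three common configurations described above differ only in where the data roots and virtual hard disks end up, which can be summarized in a small lookup sketch. The function and the `"default"`, `"external"`, and `"imported"` labels are invented for illustration, the default paths are the ones quoted earlier, and the sketch assumes the imported virtual machine's folder is named after the virtual machine.

```python
from pathlib import PureWindowsPath

# Default Hyper-V settings quoted in the text above.
DEFAULT_VM_PATH = r"C:\ProgramData\Microsoft\Windows\Hyper-V"
DEFAULT_VHD_PATH = r"C:\Users\Public\Documents\Hyper-V\Virtual Hard Disks"


def file_locations(kind, vm_name=None, chosen_location=None):
    """Return (data_root, snapshot_root, vhd_folder) for a configuration.

    kind is "default", "external", or "imported" (labels made up for this
    sketch). The last two need the VM name and the location that was chosen.
    """
    if kind == "default":
        root = PureWindowsPath(DEFAULT_VM_PATH)
        return str(root), str(root), DEFAULT_VHD_PATH
    root = PureWindowsPath(chosen_location) / vm_name
    if kind == "external":
        vhd = root  # disks sit next to the configuration
    elif kind == "imported":
        vhd = root / "Virtual Hard Disks"  # the one difference after import
    else:
        raise ValueError(f"unknown configuration: {kind}")
    return str(root), str(root), str(vhd)


for kind in ("default", "external", "imported"):
    print(kind, file_locations(kind, vm_name="Web01", chosen_location=r"D:\VMs"))
```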
Changing a virtual machine to a default data root virtual machine
If you have an existing virtual machine that you want to change to a “default data root” configuration – the easiest way to do this is to export the virtual machine and then import it and check the option to “Duplicate all files so the same virtual machine can be imported again”. The resulting virtual machine will be a default data root virtual machine.
Changing a virtual machine to an external data root virtual machine
If you have an existing virtual machine that you want to change to an “external data root” configuration, you have two options:
Changing the snapshot data root for a virtual machine
The only way to change the virtual machine data root for a virtual machine is by using import / export. But the snapshot data root for a virtual machine can be changed at any time – as long as all snapshots are deleted first. If you have deleted all existing snapshots you can change the snapshot data root by changing the “Snapshot File Location” setting for the virtual machine under the virtual machine settings user interface.
And that is pretty much all there is to know about where virtual machine files are stored today :-)
Cheers,
Ben
A few years ago, after writing my whitepaper on VMware Infrastructure (VI) plug-ins, I was fortunate enough to visit the VMware campus and meet several of their engineers. I mentioned to them that it would be nice to have a VMware Infrastructure simulator to develop against for when a full VMware installation was not available. They said that while that may be nice, it was a lot of work, and wouldn’t installing an ESX server be a better idea?
Skip ahead two years when I joined Hyper9. I mentioned to some of the other employees that I created a mini-simulator while working on my VMM project. They said it sure would be nice to have a full VMware simulator.
Rewind your clock to last fall. David Marshall and Dave McCrory (CTO of Hyper9) mentioned that they had been thinking it would be cool to create a VMware simulator, and wouldn’t you know it, Dave McCrory, after some poking around, discovered a very special JAR file that turned out to be the equivalent of our very own flux capacitor.
It is March 2010 and I am pleased as punch to announce the immediate availability of the open source (BSD) project from Hyper9, SimDK, a VMware vSphere4 simulator which provides vSphere4 API-compatibility for official vSphere4 clients and other applications built using the vSphere4 SDK. SimDK represents months of hard work and is nothing less than an API-compatible recreation of the vSphere SDK.
The following video presents a brief overview of SimDK.
And this video is a demonstration of connecting to SimDK with the vSphere4 PowerCLI.
In a standard vSphere deployment, the vSphere4 clients, such as the vSphere PowerCLI (PowerShell), the vSphere Client, and other toolkits access vSphere4 through the vSphere4 SDK.
The SDK is responsible for handling requests such as creating virtual machines (VMs) and issuing vMotion commands. And of course the SDK is also in charge of providing the clients with responses to their requests.
SimDK is able to simulate a vSphere4 environment by replacing the vSphere API/SDK web service with the SimDK web service.
The SimDK web service handles requests from vSphere4 clients and instead of communicating with a vCenter database or an ESX server, the requests are handled by the SimDK simulator. The data is persisted in SimDK’s own database tables and the responses are serialized and sent back to the clients.
SimDK is also able to proxy other hypervisors by emulating the vSphere4 API to the vSphere4 clients.
For example, when a vSphere client connects to the SimDK web service, the web service could be configured to proxy communications to a Citrix XenServer. In this way SimDK can emulate vSphere4 for a Citrix Xen environment.
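The simulate-or-proxy idea described above can be illustrated with a toy dispatcher. None of this is SimDK code or the vSphere API: the class names, the `CreateVM` and `ListVMs` method names, and the fake Xen client are all invented here to show a front end that stays the same while the backend behind it is swapped.

```python
class SimulatorBackend:
    """Answers requests from its own in-memory store (a stand-in for database tables)."""

    def __init__(self):
        self.vms = {}

    def handle(self, method, **params):
        if method == "CreateVM":
            self.vms[params["name"]] = {"powered_on": False}
            return {"created": params["name"]}
        if method == "ListVMs":
            return {"vms": sorted(self.vms)}
        raise NotImplementedError(method)


class ProxyBackend:
    """Translates requests into calls on another hypervisor's client object."""

    def __init__(self, hypervisor_client):
        self.client = hypervisor_client

    def handle(self, method, **params):
        if method == "ListVMs":
            return {"vms": sorted(self.client.list_guests())}
        raise NotImplementedError(method)


class WebService:
    """Front end: clients see the same API whichever backend is configured."""

    def __init__(self, backend):
        self.backend = backend

    def request(self, method, **params):
        return self.backend.handle(method, **params)


class FakeXenClient:
    """Hypothetical stand-in for a real hypervisor client library."""

    def list_guests(self):
        return ["xen-guest-b", "xen-guest-a"]


sim = WebService(SimulatorBackend())
sim.request("CreateVM", name="test-vm")
print(sim.request("ListVMs"))

proxied = WebService(ProxyBackend(FakeXenClient()))
print(proxied.request("ListVMs"))
```

The client code calling `request` is identical in both cases, which is the property that lets unmodified vSphere4 clients talk to either a simulator or another hypervisor.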
I think SimDK is one of the most exciting pieces of software released in the realm of virtualization in a long time. If you’re interested in learning more about SimDK or want to become involved with the project, please visit the SimDK homepage (a work in progress). In the meantime, if you have any questions, feel free to e-mail me at sakutz at gmail. Thanks!

Happy day for EMC unified customers. A whole bunch of new integration, additional cost savings – all for existing and new EMC customers. Oh, and it’s all free :-)
Here’s the PR, but in usual fashion, I tend to like the nerd version.
Read on for more!
So – without further ado, what’s new and GA?
Here’s a demo, which shows it all in action.
Download the high-resolution video in WMV format here (24MB) and MOV format here (700MB – think something was wrong with the encoding :-).
Things to know:
So – how does the data savings work?
While every approach has its advantages, it also has disadvantages. One advantage of this approach is that we were able to do it with no impact on any other Celerra function, and without limiting the filesystem in any way. It’s also not a function of the Celerra platform – so every member of the family works the same way.
With the Celerra and CLARiiON sharing more and more (lots in the iSCSI stack, and also the CBFS layer that handles thin provisioning and is also where this compression occurs) – you can see where this is going next :-)
EMC’s NAS+Block platforms are our fastest growing platforms – enjoying their 11th consecutive quarter of double digit growth, and according to IDC, EMC now has north of 50% of the NAS market, more than double the nearest competitor. Today’s announcement fills some holes, extends some leads – in general a good thing for our customers!
It’s been a busy couple of weeks! I was in Vienna, Austria, all last week, and I’m on the US West Coast this week. Even though I’ve been on the go, I’ve still been collecting various virtualization-related posts and tidbits. Here they are for you in Virtualization Short Take #36! I hope you find something useful.
I do have a few other articles in my “things to read list” that I haven’t yet gotten around to reading:
How to Integrate ThinApp with Quest vWorkspace 7.0 (The Official Quest Software Desktop Virtualization Group Blog)
DRS Resource Distribution Chart
HP Flex-10 versus Nexus 5000 & Nexus 1000V with 10GE passthrough
That’s it for now. I hope that you’ve found something useful here, and—as always—I’d love to hear your thoughts in the comments below.
This article was originally posted on blog.scottlowe.org. Visit the site for more information on virtualization, servers, storage, and other enterprise technologies.
Virtualization Short Take #36
Do you recall the post I published in September of 2009 titled "VMworld 2009 – Thoughts from the show"? If you read this post you may recall that many VI Admins shared a recurring theme that thru the virtualization of the data center they were becoming quasi-storage administrators.
Oh, there's no doubt about it. The task of bringing storage online is a multi-step process that requires the coordination of information and tasks between the storage admin and VI admin teams. Unfortunately for the VI admin there are considerably more tasks related to the basic processes, like the provisioning of storage, that they must complete as compared to the storage admin.
To illustrate this point I have created the following graphic depicting the storage provisioning process.
I don't mean to upset any storage admins here, but this transformation of the role around who's actually the storage admin is spot on. Consider the number of configuration parameters multiplied by the number of nodes for which the VI admin must manage storage connectivity. They have their hands full.
What is clear is the model is broken
Going back to last fall's post you may recall that I posed challenges to address this issue to the VI community and VMware's storage partners. To the former I suggested they demand more integration from their storage vendors. To the latter I challenged them to step up and eliminate the “Ever Multiplying Complexity” associated with managing the ever-growing shared storage architectures in virtual infrastructures.
Six months have passed, and I believe it's time to measure the progress of the industry to integrate and significantly simplify the role of managing storage. I will cover the results of the joint engineering efforts from NetApp and VMware in this short series of posts.
Introducing the new model from NetApp
Since VMworld, NetApp has continued to develop the Rapid Cloning Utility (RCU) vCenter plug-in, released the Virtual Storage Console (VSC) vCenter plug-in, and updated our storage best practices document TR-3749 (available as a download or for purchase as a book). By combining the recent releases of these tools, NetApp is delivering a new model for provisioning, managing, and monitoring storage with VMware.
I would like to clarify one point. Just because there is a new model available to you, it does not mean you must change your processes; however, I trust that once we dig into the gains made by the new process, you will be looking to implement the new model sooner rather than later.
Before we begin...
In order to take advantage of the new model you will need to ensure your environment meets the following requirements.
The new model for storage administrators
The technologies available from NetApp provide the means for a new operational model where storage administrators can significantly simplify their support for virtual infrastructures.
In this new model storage admins are responsible for configuring the physical storage array, providing data protection and managing the overall utilization of the array. Once the physical architecture is deployed NetApp admins supporting VMware can simply provision pools of ‘raw’ storage resources (aggregates, FlexVols, and storage network interfaces) directly for use in the virtual infrastructure.
(click on the image to view at full size)
This model significantly simplifies the tasks required by the storage admin while allowing the VMware admin to provision and manage datastores and the storage constructs associated with them such as LUN masking, storage I/O path management, etc... directly from physical resources assigned to the virtual infrastructure by the storage admin.
This process begins once the RCU is installed, at which point the storage administrator may assign storage resources for use by the virtual infrastructure. To assign these resources, the storage admin logs in to vCenter and opens the RCU configuration panel (select a controller, then properties, then assign resources).
(click on the image to view at full size)
Once a resource has been assigned it will be used exclusively by the virtual infrastructure. Resources not assigned are ignored and inaccessible by the virtual infrastructure.
To prevent further changes, the storage administrator has the highly advisable option to lock or restrict the ability to assign additional resources. Checking a box and entering a username and password completes this securing process. The account used to lock these settings is stored securely inside the RCU.
That's it! The storage admin has completed all he or she needs to do in order to provide storage services to the VMware environment. No LUNs, no LUN masking, no NFS exports, no multipathing, etc... seriously that's it!
Too simple to be true?
I'll continue this post tomorrow, when I will go into significant detail about the functionality NetApp delivers to the VI Admins.
While I like to say, "Virtualization Changes Everything," I believe Leonardo da Vinci said it best with, “Simplicity is the ultimate sophistication.”
Lots of back and forth with various folks internally on this in the last couple of days, and thought I would just put it out there:
Despite the obvious simplicity that would come from directly connecting a UCS to an iSCSI, NFS, FC or FCoE target, this is currently (as of this posting on March 3rd, 2010) not possible with just a UCS chassis and UCS 6100 series fabric interconnect.
I’ve had a couple people ask me “why” – after all, they’re just standard SFP+ connectors - you should be able to “plug Tab A into Slot B”.
This is not new info (Scott and Rick have hit on this), but since I got asked so much (twice last week at different customers in a New York City tour), I thought I would summarize it here, as our readerships are not the same.
Answer:
- For FC/FCoE storage – you currently (this is not a hardware limitation, but a case of firmware support) cannot do this because the UCS6100 operates in NPV mode, not NPIV. This means it looks like a “node” not a “switch”. All switching decisions are handled by an upstream switch, for example an MDS or Nexus 5000. This is well discussed by my teammates Rick Scherer here and Scott Lowe here.
- For IP Storage (iSCSI/NFS) – this is currently not supported, and for similar reasons. Scott does a good post on that here.
These are, in fact, tightly coupled to VCE efforts – there is an ongoing workstream and integrated Vblock roadmap (the Vblock is viewed as a “product” – ergo, integrated value proposition, integrated roadmap – not constituent elements). That roadmap has an ongoing “reduce cost per VM instance” part to it, and also a “differentiated value proposition” part to it (just like any product roadmap does).
Over time, we want to reduce the need for any extra elements – so yes, we’re working on it. The pressure is greatest when we look at very low-end configurations (we are working on a Vblock Type 0 aimed at the entry market).
It’s also notable that the upside of these integrated approaches is that if it’s on the Vblock Bill Of Materials, it’s been validated to work in an integrated fashion (after all, it is a “product” not a “set of products”) – and you don’t need to worry about “did I configure it right”.
These webcasts which EMC and VMware are doing jointly on tier-1 apps in the VMware context are generally very popular.
We did one last week – and the session was recorded. I’m glad it was, as it followed the pattern and was very popular. You can see the recorded session here (just follow the registration, and you get to the “on-demand” page).
Enjoy, and thank you to the VMware and EMC solutions teams who collaborated on the testing, documentation, and webcast.
SharePoint Storage Design Guidance and Virtualization Best Practices
Thanks to a recent tweet by Jason Boche I found a site on VMware Labs’ page to VMware Flings, a series of small applications VMware Labs uses internally and thought other people might find interesting. A few pieces that are already out there, like Onyx, I recognized. But the one that caught my eye was called SVGA Sonar. What does it do?
SVGA Sonar is a demo application for SVGADevTap. SVGADevTap is a user-level library that communicates with the VMware SVGA guest driver to provide low-latency notifications of changes to the screen. Sonar was designed to use the devtap API to visualize application drawing patterns by rendering a scaled-down view of the desktop replacing pixel data with color-coded rectangles. As applications update the screen, Sonar presents its scaled version of the screen using colors to denote different types of rendering commands and whether this rendering caused a visible change to the screen. This application is called “Sonar” because the Sonar window accumulates transient updates and represents them as fading rectangles like a sonar display.
Can anyone guess what this was used for? Hint: read my previous blog post.

http://social.technet.microsoft.com/wiki/
There is already a bunch of Hyper-V content up there.
I have already had people ask me why we are looking at doing a wiki when we already have Technet + KBs + Blogs + Forums.
There are a number of reasons for this (and to be clear – the Technet wiki is run by a different group at Microsoft – so they probably have more reasons than I have for this) but the biggest advantage that I see is that this is a great opportunity to get “real world best practices” in a place where everyone can access them.
So – if you have found that in your environment there are things that have worked well with Hyper-V, or things that have not, get over to the wiki and write it up for others to learn about it.
Cheers,
Ben
I was browsing through an EMC technical document titled “EMC CLARiiON Integration with VMware ESX Server” (download it here) a little while ago and I came across a phrase in the document that caught my attention:
“VMware ESX/ESXi support both Fibre Channel and iSCSI storage. However, VMware and EMC do not support connecting VMware ESX/ESXi servers to CLARiiON Fibre Channel and iSCSI devices on the same array simultaneously.”
What? No Fibre Channel and iSCSI from the same array to a VMware ESX/ESXi host simultaneously? That piqued my curiosity, so I contacted a few people within EMC to question the veracity of that statement. It turns out that the answer is more complicated than it might seem at first glance.
For those of you who aren’t interested in the deep technical details, here’s the short explanation behind this behavior:
I’m confident that some other array vendors out there will be very quick to jump on this post and harp on this limitation until the cows come home. I would just ask this question: is it really as big of a limitation as it seems? I’ll come back to that question in a moment.
With the short explanation in mind, here are the more in-depth details. If you like the longer, more technical explanation, then read on!
From EMC’s side, the root of the restriction about using both Fibre Channel and iSCSI devices on the same array simultaneously stems from the interaction of host registration and storage groups.
Host registration is a requirement in the CLARiiON world. In order to present storage to a host from a CLARiiON array, you must first register the host’s initiators with the array in Navisphere. Once the host has been registered, then you can proceed with presenting storage to that host. In theory the CLARiiON could operate without registering hosts and initiators, but EMC chose to require registration. EMC made this choice in order to help simplify host management.
Requiring host registration is a bit different than some other storage arrays on the market. It’s not better or worse, just different. (Remember, pros and cons come from every technology decision.)
If you’re like me, you’re probably wondering at this point how requiring host registration simplifies anything. Instead of having to manage multiple paths, multiple initiators, and individual hosts every time you want to present storage to a host, you only need to register the host—and all of its initiators—and then you can refer to that same object (the host) over and over again as needed. Yes, host registration does mean a bit more work up front, but the idea is that it will save some work down the road. I guess you can think of host registration kind of like defining aliases in your Fibre Channel zoning configuration: it’s a bit more work up front, but it simplifies things later down the road. If you didn’t create device aliases in your Fibre Channel switch, you’d end up having to re-enter Fibre Channel WWPNs multiple times. You create the aliases so that it’s easier later. The same applies to host registration. Again, it’s a matter of choices.
One might also say that registration is a security measure, albeit a weak one. Rather than allow just any Fibre Channel-attached or iSCSI-attached host to see storage, the array requires that it know about the host (via host registration) in order to present storage to the host. This provides an additional layer of security to ensure that only authorized hosts are presented storage from the array.
Now you have a fairly decent idea of why host registration is necessary. So how does host registration occur? Host registration can occur either manually or automatically. Starting with version 4.0, both VMware ESX and VMware ESXi will automatically register with a CLARiiON array running any recent version of FLARE (ESX 3i version 3.5 also supports this form of push registration). FLARE release 28 and earlier will show these hosts as “Manually registered, unmanaged”; starting with FLARE 29, these hosts are listed as “Manually registered, managed”. In either case, the registration occurs automatically. If the host is Fibre Channel-attached, then the Fibre Channel initiators will be included in the automatic registration. The same goes for iSCSI initiators. Normally, this is a good thing because it saves the administrator the extra steps of registering the host with the storage array. (Also, because VMware ESX/ESXi hosts register automatically, there is no need to install the Navisphere Agent.)
In this case, though, the automatic registration causes a problem. Why? This goes back to the second item I said I needed to discuss: storage groups. Specifically, storage groups have two characteristics that come into play here:
Do you see how all the pieces come together? The only way to control which LUNs should be presented via which protocol is to use multiple storage groups—but a host can only be in a single storage group at a time. With only a single host object for any given VMware ESX/ESXi host, that host can only see either Fibre Channel LUNs (by being in a storage group containing Fibre Channel LUNs) or iSCSI LUNs (by being in a storage group containing iSCSI LUNs), but not both. Hence, the statement in the CLARiiON document I referenced in the very beginning of this blog post that outlines using either Fibre Channel or iSCSI but not both. This behavior is required to enforce the single-protocol LUN access required by VMware.
As with all things, there is a workaround. Because it is a workaround, an RPQ is necessary to get full support.
To work around this problem, you’ll need to ignore the automatic host registration (or disable the automatic host registration) and instead create two manually registered “pseudo-hosts”: one with the Fibre Channel initiators and one with the iSCSI initiators. These “pseudo-hosts” will need fake IP addresses (if they both use the same IP address, Navisphere will treat them as the same host, thus defeating the purpose of the workaround). Put the Fibre Channel initiators into the Fibre Channel storage group(s), and put the iSCSI initiators into the iSCSI storage group(s). Each “pseudo-host” will be able to see LUNs presented to that storage group and therefore would see both Fibre Channel and iSCSI LUNs at the same time. And, as required by VMware, any given LUN would be accessed only via Fibre Channel or iSCSI but not both. Remember that you need to file an RPQ in order to get support on this configuration.
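To make the storage group logic a bit more concrete, here is a minimal Python sketch of the constraint and the workaround. This is a toy model only: it does not talk to Navisphere or any real array API, and the class names, host names, and LUN names are all hypothetical. The one rule it encodes comes from the discussion above: a host object can be in only one storage group at a time.

```python
from dataclasses import dataclass, field


@dataclass
class StorageGroup:
    name: str
    luns: set = field(default_factory=set)


class ToyArray:
    """Toy model: each registered host object may join only one storage group."""

    def __init__(self):
        self.membership = {}  # host object name -> StorageGroup

    def connect(self, host, group):
        if host in self.membership:
            raise ValueError(f"{host} is already in {self.membership[host].name}")
        self.membership[host] = group

    def visible_luns(self, *hosts):
        # Union of the LUNs reachable through the given host objects.
        luns = set()
        for host in hosts:
            luns |= self.membership[host].luns
        return luns


fc_group = StorageGroup("fc-sg", {"fc-lun-0", "fc-lun-1"})
iscsi_group = StorageGroup("iscsi-sg", {"iscsi-lun-0"})
array = ToyArray()

# A single auto-registered host object: the second storage group is rejected.
array.connect("esx01", fc_group)
try:
    array.connect("esx01", iscsi_group)
    single_host_blocked = False
except ValueError:
    single_host_blocked = True

# The workaround: two manually registered pseudo-hosts for one physical server.
array.connect("esx01-fc", fc_group)
array.connect("esx01-iscsi", iscsi_group)
both = array.visible_luns("esx01-fc", "esx01-iscsi")

print(single_host_blocked, sorted(both))
```

In the sketch, each LUN still lives in exactly one storage group, so any given LUN is reached over only one protocol, which mirrors the VMware requirement mentioned above.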
For VMware ESX/ESXi 4.0 hosts (and ESX 3i version 3.5 hosts), you can disable automatic registration using the Disk.EnableNaviReg advanced configuration option. Setting this value to 0 disables the automatic registration with Navisphere. (Here are screenshots for VMware ESX 3i and VMware ESX/ESXi 4.) If you disable the automatic registration, then you only need to manually register the Fibre Channel and iSCSI initiators as separate “pseudo-hosts” and you’re ready to go.
Let me reiterate that if you are presenting iSCSI LUNs via the Celerra and not the CLARiiON, none of this applies. Presenting Fibre Channel LUNs via the CLARiiON and iSCSI LUNs via the Celerra to the same VMware ESX/ESXi host is fine. This workaround that I’ve described only applies when you want to present some LUNs via Fibre Channel and some LUNs via iSCSI from a CLARiiON to a single VMware ESX/ESXi host.
Earlier you’ll recall that I asked this question: is this really a limitation? There are a couple of viewpoints:
I can see both sides of the coin. Personally, I tend to side more with the second viewpoint and would prefer to see the CLARiiON have the ability to easily present Fibre Channel and iSCSI to the same host, especially when multiple disk pools are involved. I think that CLARiiON engineering is now evaluating this possibility; as more information emerges, I’ll be sure to keep you posted.
Courteous and professional comments, clarifications, or corrections are always welcome!
This article was originally posted on blog.scottlowe.org. Visit the site for more information on virtualization, servers, storage, and other enterprise technologies.
VMware ESX, EMC CLARiiON Arrays, and Multiple Protocols
Last week VMware released a patch specifically for the version of Update Manager (VUM) that is included in vSphere 4.0 Update 1.
The new version (build 231675) is needed to fix a couple of bugs affecting capability to upgrade the Cisco Nexus 1000V virtual switch and the time required to scan and patch hosts in a cluster.
Thanks to Yellow Bricks for the news.
Hello all, thought I’d post a little interesting info here.
In ESX 4.0 update 1 (with no patches), there is an interesting bug that affects PCoIP connections on View. I thought I’d illuminate a little information about this bug as I think it does a good job of illustrating how the PCoIP software sender works with VMware’s underlying hypervisor.
In VMware ESX 3.5 Update 4 and vSphere 4.0 Update 1, there is a virtual device that’s part of each VM’s hardware called DevTap. DevTap is a way for other processes to “tap” into the device layer – specifically video and audio. In the case of PCoIP, it’s used to do screen scraping (for the PCoIP sender) as well as audio. You can see an example of this by looking at Device Manager when you connect to a VM with PCoIP:
The audio stuff isn’t terribly interesting, but what it does with the information obtained from the video card is.
As described in Scott Davis’s blog article here, PCoIP will use different kinds of codecs for different types of objects. It will also prioritize things like the currently selected window. But how does it obtain this information?
Most remote display protocols either run in the Presentation layer or they scrape data directly from the video buffer. An example of the former is RDP – it reads the data that is supposed to be drawn and draws it to a virtual video device. This is why you can run RDP in a higher resolution than the physical video card on the machine could support. An example of the latter would be VNC, which reads the data that’s in the video card’s frame buffer and sends that over the wire. RDP’s advantage in this case is that it gets data earlier in the drawing process, so it can be more descriptive with lines and shapes (write text here, draw a circle here) whereas VNC can only go by what’s already been drawn, so it has to work with bitmaps and images, not instructions on what to draw. On the other hand, because VNC is only scraping the screen, it’s completely platform independent. That’s why you can get VNC servers for Windows, Mac, Linux… practically anything.
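If the "screen scraping" idea feels abstract, here is a tiny Python sketch of what working from already-drawn pixels looks like. It is purely illustrative and is not how VNC or PCoIP are actually implemented: the frames are small nested lists, and the function name and tile size are my own assumptions. All a scraper can do is compare two frames and report which regions changed.

```python
def changed_tiles(prev, curr, tile=2):
    """Return (row, col) origins of tile x tile blocks that differ between frames."""
    dirty = []
    for r in range(0, len(curr), tile):
        for c in range(0, len(curr[0]), tile):
            before = [row[c:c + tile] for row in prev[r:r + tile]]
            after = [row[c:c + tile] for row in curr[r:r + tile]]
            if before != after:
                dirty.append((r, c))
    return dirty


# Two 4x4 "frame buffers" that differ by a single pixel.
prev_frame = [[0] * 4 for _ in range(4)]
curr_frame = [row[:] for row in prev_frame]
curr_frame[3][2] = 255

dirty = changed_tiles(prev_frame, curr_frame)
print(dirty)  # [(2, 2)]
```

Notice that the scraper knows a block of pixels changed but has no idea whether that was text, a circle, or a video frame, which is exactly the trade-off described above.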
PCoIP is somewhere in the middle. The actual protocol itself was originally a pure hardware solution, so at its core PCoIP is a screen scraping solution like VNC. The difference is PCoIP has many different algorithms it can use to encode different parts of the screen, and so there is a lot of work that’s done to figure out what the best codec to use is for a particular region of the screen. Again, see the linked blog article above for details on this. VMware’s software implementation, though, is Windows only and will not work except on a Virtual Machine running on ESX 3.5u4+ or 4.0u1+. This is because the screen scraping is actually being given “hints” farther up in the stack. One of the interesting behaviors of PCoIP is that it’s aware of where your mouse cursor is and what your selected window is. This means that the window in focus will get updates faster and at a higher priority than background elements. How does it know what you’re doing? That’s from the hints being sent to the PCoIP sender from the View Agent, which does have information from the Presentation Layer.
After it gets the hints and knows where stuff is, the PCoIP sender service then accesses the DevTap device to read the contents of the video buffer. This has the effect of being able to see what’s coming out of the “virtual monitor” in a way that’s more efficient than reading directly from the frame buffer. This is why the software PCoIP sender is only available in a VM – because it needs the DevTap device to access the virtual monitor.
Once the PCoIP connection is made, a “monitor blanking” command is sent to the virtual monitor so that someone with the vSphere client can’t snoop on someone’s virtual machine. Unfortunately, this is where the bug comes in. Sometimes, when the monitor blanking command is issued, or when you resize your PCoIP window and it changes the resolution of the VM, the DevTap device crashes. This is so bad that the only way to recover from it is to kill the VM’s process, or reboot the ESX host. Resetting or powering off the VM will hang at 95%, as outlined in this forum thread. A user in that thread recently said that the newest patches for ESX 4.0u1 in January may fix this bug; however, I’ve seen it occur after installing this patch, and indeed at VMware Partner Exchange they were still fighting this bug in the View lab sessions, so I’m not sure if it’s resolved yet. It is, however, a very rare occurrence so I wouldn’t lose much sleep over it. I just thought that outlining the cause of this bug would be a good opportunity to explain some of the reasons why PCoIP is so clever.
DISCLAIMER: I am not a VMware engineer. Some of this information has been obtained from sources in bits and pieces, and some of it was just from deduction and putting the pieces together. The definitive source for all info on PCoIP is Warren Ponder, whose blog is here.
UPDATE: As of today (March 3rd), VMware just released a series of patches for ESX and ESXi 4.0, one of which fixes this particular bug. From the KB article:
Changing the resolution of the guest operating system over a PCoIP connection (desktops managed by View 4.0) might cause the virtual machine to stop responding.
Symptoms: The following symptoms might be visible:
So get patchin’ folks!

In case you missed today's Ziff Davis webcast in which EMC, Microsoft and Sanbolic highlighted our joint solution supporting Enterprise Deployments of Microsoft Hyper-V, let me provide a recap.
Over the last several weeks, the three companies worked together to build a scalable solution to support Hyper-V in Mid-Size to Enterprise deployments. Using key features in Hyper-V including increased scalability such as support for 64 logical processors and improved memory management along with Live Migration makes Microsoft virtualization a viable option for deploying a large number of virtual machines. Sanbolic provides software enabling advanced management features especially when using Hyper-V. Their virtual file system (Melio) is a 64 bit cluster file system that allows up to 200 hosts to access a single LUN while their volume management software (LaScala) can share access to and administer storage volumes spanning multiple storage controllers. Lastly, EMC's CLARiiON storage array is a 64-bit architecture that provides high performance for demanding workloads and supports advanced features such as Ultraflex which allows additional I/O modules to be added to the array with no interruption of service. CLARiiON also supports Enterprise Flash Drives, Fibre Channel drives and SATA drives all in the same array and allows you to automatically add more drives and expand an existing volume without any downtime to the hosts.
So with these three technologies, we were able to test several scenarios. The first was to boot 260 virtual machines all at the same time while ensuring no disruption of service to the users. As expected, all VMs booted simultaneously, confirming that the solution could provide the necessary resources to support this scenario.
Next we tested volume expansion. After adding additional storage on the CLARiiON array and making it available to the hosts, we used LaScala to expand the volume to the 260 VMs while they were either running or booting and measured performance. During the disk and volume expansion, all VMs continued to operate without any noticeable impact to performance.
Our third test was to migrate virtual machines between parent servers to ensure there was no disruption to service to the application or workloads occurring on the VMs. We were able to migrate VMs using Microsoft Hyper-V R2 Live Migration 10 consecutive times without any impact to the services running on the VMs. Additionally, the average time of migration lasted only 9 seconds!
Lastly we tested snapshots including taking a snapshot of a volume hosting all 260 VMs without any disruption in service to the workloads running on the VMs. This showed that we could capture a disk based backup of the virtual environment in a matter of seconds without impacting the performance of the individual VMs.
If you're interested in hearing the whole webcast you can access it at the Ziff Davis Enterprise eSeminars website. Look under the Past Events section over the next few days. I'll also be sure to post a link to the whitepaper when it is posted soon!
So what's next? We're definitely not going to sit around with our feet up! I'm currently working in the labs with our Proven Solutions team to build out a virtualized SQL Server 2008 lab configuration using Hyper-V. I'll be recording some demos and I'm building our next webcast coming up at the end of the month. I'll post more details as we get closer but I'm sure you'll find it interesting! Expect this one to get more technical with more demos of how to manage your Hyper-V environment. I'll also post some experiences of this lab build out in some upcoming blog posts.
Finally I need to send a congrats to my good buddy Chad Sakac and the rest of our friends North of the Border. Congratulations on a great Olympics and for stashing all that gold. The final hockey game was a tough pill to swallow but I thought the Canadian crowd showed some class during the medal ceremony. Plus it is about time you Hosers are good at something! ;-)
I recently had the opportunity to work on a proof of concept (PoC) in which we wanted to help a customer streamline the processes needed to deploy new hosts and reduce the amount of time it took overall. One of the tools we used in the PoC for this purpose was PXE booting VMware ESX for an automated installation. Here are the details on how we made this work.
Before I get into the details, I’ll provide this disclaimer: there are probably easier ways of making this work. I specifically didn’t use UDA or similar because I wanted to gain the experience of how to do this the “old fashioned” way. I also wanted to be able to walk the customer through the “old fashioned” way and explain all the various components.
With that in mind, here are the components you’ll need to make this work:
Make sure that each of these components is working as expected before proceeding. Otherwise, you’ll spend time troubleshooting problems that aren’t immediately apparent.
First, copy the contents of the VMware ESX 4.0 Update 1 DVD (not the actual ISO, but the contents of the ISO) to a directory on the FTP server. Test it to make sure that the files can be accessed via an anonymous FTP user.
Also go ahead and create a simple kickstart script that automates the installation of VMware ESX. I won’t bother to go into detail on this step here; it’s been quite adequately documented elsewhere. You’ll need to put this kickstart script on the FTP server as well.
At this point, you’re ready to proceed with gathering the PXE boot files.
The first task you’ll need to complete is gathering the necessary files for a PXE boot environment.
First, copy the vmlinuz and initrd.img files from the VMware ESX 4.0 Update 1 ISO image. Since I use a Mac, for me this was a simple case of mounting the ISO image and copying out the files I needed. Linux or Windows users, it might be a bit more complicated for you. These files, by the way, are in the ISOLINUX folder on the DVD image.
Next, you’ll need the PXE boot files. Specifically, you’ll need the menu.c32 and pxelinux.0 files. These files are not on the DVD ISO image; you’ll have to download Syslinux from this web site. Once you download Syslinux, extract the files into a temporary directory. You’ll find menu.c32 in the com32/menu folder; you’ll find pxelinux.0 in the core folder. Copy both of these files, along with vmlinuz and initrd.img, into the root directory of the TFTP server. (If you don’t know the root directory of the TFTP server, double-check its configuration.)
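Before moving on, it can be handy to confirm that all four files actually landed in the TFTP root. Here is a minimal Python sketch of such a check. The function name is my own, and the demo uses a temporary directory as a stand-in for your TFTP root, so substitute whatever path your TFTP server is configured to use.

```python
import tempfile
from pathlib import Path

# The four files named above that must sit in the TFTP server's root directory.
REQUIRED_FILES = ("vmlinuz", "initrd.img", "menu.c32", "pxelinux.0")


def missing_pxe_files(tftp_root):
    """Return the required PXE boot files that are not present in tftp_root."""
    root = Path(tftp_root)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


# Demo against a throwaway directory that only has the two files from the ESX ISO.
with tempfile.TemporaryDirectory() as demo_root:
    for name in ("vmlinuz", "initrd.img"):
        (Path(demo_root) / name).touch()
    print(missing_pxe_files(demo_root))  # ['menu.c32', 'pxelinux.0']
```

An empty list means everything is in place.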
You’re now ready to configure the PXE boot process.
Once the necessary files have been placed into the root directory of the TFTP server, you’re ready to configure the PXE boot environment. To do this, you’ll need to create a PXE configuration file on the TFTP server.
The file should be placed into a folder named pxelinux.cfg under the root of the TFTP server. The PXE configuration file should be named something like this:
01-<MAC address of network interface on host>
If the MAC address of the host was 01:02:03:04:05:06, the name of the text file in the pxelinux.cfg folder on the TFTP server would be:
01-01-02-03-04-05-06
The PoC in which I was engaged involved Cisco UCS, so we knew in advance what the MAC addresses were going to be (the MAC address is assigned in the UCS service profile).
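If you have a list of MAC addresses to work through, the naming rule above is easy to script. Here is a small Python sketch; the function name is hypothetical. It lowercases the address because PXELINUX looks for the filename in lowercase hexadecimal with dash separators.

```python
import re


def pxe_config_name(mac):
    """Turn a MAC address into the matching pxelinux.cfg filename."""
    octets = re.split(r"[:-]", mac.strip().lower())
    if len(octets) != 6 or not all(re.fullmatch(r"[0-9a-f]{2}", o) for o in octets):
        raise ValueError(f"not a MAC address: {mac!r}")
    # The leading "01" is the hardware type prefix (Ethernet), not part of the MAC.
    return "01-" + "-".join(octets)


print(pxe_config_name("01:02:03:04:05:06"))  # 01-01-02-03-04-05-06
```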
The contents of this file should look something like this (lines have been wrapped here for readability and are marked by backslashes; don’t insert any line breaks in the actual file):
default menu.c32
menu title Custom PXE Boot Menu Title
timeout 30
label scripted
menu label Scripted installation
kernel vmlinuz
append initrd=initrd.img mem=512M ksdevice=vmnic0 \
ks=ftp://A.B.C.D/ks.cfg
IPAPPEND 1
You’ll want to replace ftp://A.B.C.D/ks.cfg with the correct IP address and path for the kickstart script on the FTP server.
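With more than a handful of hosts, you may prefer to generate these files rather than hand-edit them. The following Python sketch renders the same configuration shown above with the kickstart URL filled in. The function name and the 192.0.2.10 address are placeholders of my own, and the append line is written as one unwrapped line, as the real file requires.

```python
# Same contents as the sample file above, with the append line unwrapped.
PXE_TEMPLATE = """\
default menu.c32
menu title Custom PXE Boot Menu Title
timeout 30
label scripted
menu label Scripted installation
kernel vmlinuz
append initrd=initrd.img mem=512M ksdevice=vmnic0 ks={ks_url}
IPAPPEND 1
"""


def render_pxe_config(ftp_host, ks_path="ks.cfg"):
    """Return the PXE configuration text for the given FTP server and kickstart path."""
    ks_url = f"ftp://{ftp_host}/{ks_path.lstrip('/')}"
    return PXE_TEMPLATE.format(ks_url=ks_url)


config = render_pxe_config("192.0.2.10")  # placeholder address
print(config)
```

Write the returned text to a file named after the host's MAC address in the pxelinux.cfg folder, as described earlier.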
Only one step remains: configuring the DHCP server.
As I mentioned earlier, I used the Windows DHCP server as a matter of ease and convenience; feel free to use whatever DHCP server best suits your needs. There are only two options that are necessary for PXE boot:
066 Boot Server Host Name (specify the IP address of the TFTP server)
067 Bootfile Name (specify pxelinux.0)
In this particular example, I created reservations for each MAC address. Because the values were the same for all reservations, I used server-wide DHCP options, but you could use reservation-specific DHCP options if you wanted different boot options on a per-MAC address (i.e., per-reservation) basis.
Recall that this PoC was using Cisco UCS blades. Thus, in this environment, to prepare for a new host coming online we only had to make sure that we had a PXE configuration file and create a matching DHCP reservation. The MAC address would get assigned via the service profile, and when the blade booted then it would automatically proceed with an unattended installation. Combined with Host Profiles in VMware vCenter, this took the process of bringing new ESX/ESXi hosts online down to mere minutes. A definite win for any customer!
This article was originally posted on blog.scottlowe.org. Visit the site for more information on virtualization, servers, storage, and other enterprise technologies.
PXE Booting VMware ESX 4.0