Kubernetes Anywhere and PhotonOS Template

Experimenting with Kubernetes to orchestrate and manage containers? If you are like me and already have a lot invested in vSphere (time, infra, knowledge), you might be excited to use Kubernetes Anywhere to deploy it quickly. I won’t rewrite the instructions found here:

https://github.com/kubernetes/kubernetes-anywhere

It works with

  • Google Compute Engine
  • Azure
  • vSphere

The vSphere option uses the Photon OS ova to spin up the container hosts and managers. So you can try it out easily with very little background in containers. That is dangerous as you will find yourself neck deep in new things to learn.

Don’t turn on the template!

media_1491484535602.png

If you are like me and *skim* instructions, you could be in for hours of “Why do all my nodes have the same IP?” When you power on the Photon OS template, the startup sequence generates a machine ID (and MAC address). So even though I powered it back off, the cloning process was producing VMs with identical IDs for my Kubernetes cluster. For those not hip to networking: nodes that share an ID all end up with the same DHCP address, and that is bad for communication.

Also, don’t try to be a good VMware admin and convert that VM to a VM Template. The Kubernetes Anywhere script won’t find it.

If you do like me and skip a few lines while reading (it happens, right?), make sure to check out this documentation on Photon OS. It will help get you on the right track.

https://github.com/vmware/photon/blob/master/docs/photon-admin-guide.md#clearing-the-machine-id-of-a-cloned-instance-for-dhcp

This is clearly marked in the documentation now.
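
As I read that section of the Photon OS guide, the fix comes down to emptying /etc/machine-id on the template before it gets cloned, so each clone generates its own ID on first boot. Here is a minimal Python sketch of that one step. The function name and the path parameter are my own additions (so it can be tried against a scratch file first), and it assumes you run it as root inside the template right before shutting it down:

```python
import os
import tempfile
from pathlib import Path


def clear_machine_id(path="/etc/machine-id"):
    """Empty the machine ID file and return its new size in bytes.

    The file is truncated, not deleted. The default path is the usual
    systemd location; pass another path to test without touching the
    real file.
    """
    machine_id = Path(path)
    machine_id.write_text("")  # truncate to zero bytes, keep the file in place
    return machine_id.stat().st_size
```

Call clear_machine_id() with no arguments on the template itself, or hand it a temp file path if you just want to see what it does.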

Seizing AD Roles – File under Good to know

So let’s say the power goes out and half of the VMs on your “lab storage that uses local disks” go into an infinite BSOD loop. I was lucky, as one of the servers that still worked was an AD domain controller with DNS. I usually don’t try to fight BSODs and just rebuild, so that is what I did. One very helpful page for moving the AD roles was this article on seizing the roles, which I had to do since the server holding the roles was DOA.

 

https://technet.microsoft.com/en-us/library/cc816779(v=ws.10).aspx

 

Enjoy and file this under Good to Know

PureStorage + REST API + Splunk = Fun with Data about Data

A few months back I posted a PowerShell script to post Pure Storage data directly into VMware vCenter Operations Manager (now called vRealize Operations). Inspiration hit me like a brick when a big customer of mine said, “Do you have a plugin for Splunk?”

He had already written some scripts in Python to pull data from our REST API. He just said, “Sure wish I didn’t have to do this myself.” I took the hint. Now, I am not a Python person, so I did the best I could with the tools I have.
You will notice that the script is very similar to the one I wrote for vCOPS. That is because open REST APIs rock. If you don’t have one for your product, you are wrong. 🙂

The formatting in WordPress ALWAYS breaks scripts when I paste them. So head over to GitHub and download the script today.
https://github.com/2vcps/post-rest2splunk/tree/master

Like before I schedule this as a task to run every 5 minutes. That seems to not explode the tiny Splunk VM I am running in VMware Fusion to test this out.
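
To give a feel for the "REST data in, Splunk event out" step, here is a rough Python sketch. This is not the PowerShell script from GitHub. It only shows one way to flatten a sample of array metrics into a key=value line, which Splunk can pull fields out of at search time. The function name and the metric field names are made up for illustration:

```python
def to_splunk_event(array_name, metrics):
    """Flatten one sample of array metrics into a single key=value line.

    array_name and the keys in metrics are illustrative; use whatever
    your REST call actually returns.
    """
    fields = {"array": array_name, **metrics}
    parts = []
    for key, value in fields.items():
        text = str(value)
        # Quote values that contain spaces so they stay a single field.
        if " " in text:
            text = '"' + text.replace('"', "'") + '"'
        parts.append(f"{key}={text}")
    return " ".join(parts)
```

One line per sample every 5 minutes is plenty for basic dashboards, and it keeps the tiny test VM happy.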

Dashboards. Check.

wpid1855-media_1429109420445.png

Some very basic Dashboards I created. I am not a Splunk ninja, perhaps you know one? I am sure people that have done this for a while can pull much better visuals out of this data.

wpid1856-media_1429109524852.png
wpid1857-media_1429109617758.png

Pivot Table

wpid1858-media_1429109962843.png

Stats from a lab array, with some averages computed by Splunk.

Gauge report of max latency (that is in microseconds)

wpid1859-media_1429110138347.png

1,000 of these is 1 millisecond 🙂 Pretty nice.

From Wikipedia
A microsecond is an SI unit of time equal to one millionth (0.000001 or 10⁻⁶ or 1/1,000,000) of a second. Its symbol is μs. One microsecond is to one second as one second is to 11.574 days. A microsecond is equal to 1000 nanoseconds or 1/1,000 milliseconds.
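
If you want the gauge numbers in friendlier units, the conversion is just division. A tiny Python sketch (the function names are mine):

```python
US_PER_MS = 1_000      # microseconds in one millisecond
US_PER_S = 1_000_000   # microseconds in one second


def us_to_ms(microseconds):
    """Convert a latency in microseconds to milliseconds."""
    return microseconds / US_PER_MS


def us_to_s(microseconds):
    """Convert a latency in microseconds to seconds."""
    return microseconds / US_PER_S
```

So a reading of 250 on the gauge is a quarter of a millisecond.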

Even if everything else didn’t help you at least you learned that today. Right?

The link to github again https://github.com/2vcps/post-rest2splunk/tree/master

Build IP Management Server with Ubuntu, GIT and Nmap

Have 25 VLANs in your lab, with storage, servers, switches and other equipment all over the place?

Build a good Ubuntu box

wpid1098-media_1358298686321.png
If you need more help on this, leave some comments. It is pretty straightforward.
Be sure to select openssh server as a role.
After you login:
Remember to
$sudo apt-get update
$sudo apt-get upgrade
#If you are adventurous
$sudo apt-get dist-upgrade

Install VMware Tools (you ARE doing this in a VM right?)

wpid1099-media_1358299813754.png
From your VI Client set the VMware tools install to start, this will insert the disk for VMware Tools, but unfortunately doesn’t automate the install.
$sudo -s
#apt-get install build-essential
#apt-get install
#apt-get install apache2
The last two steps might not be needed but I do them out of habit. They allow the VMware tools to build the Kernel modules needed for VMXNET drivers.
#mkdir -p /mnt/cd
#mount /dev/cdrom /mnt/cd
#cp /mnt/cd/VMwareTools-8.6.0-425873.tar.gz /tmp
#tar xvzf /tmp/VMwareTools-8.6.0-425873.tar.gz
#cd vmware-tools-distrib
#./vmware-install.pl
Now run through the install of VMware Tools.
#apt-get install nmap
#apt-get install gitweb
#exit
You want to do some of the next steps as your normal user, not root.
$cd ~
$mkdir nmaps
$git init nmaps
$sudo vi /etc/gitweb.conf
Modify the $projectroot variable to match the path to your nmaps directory.
wpid1100-media_1358302137956.png
Now restart apache.
$sudo /etc/init.d/apache2 force-reload
wpid1101-media_1358302508614.png
Ok great, the webpage will be up at http://yourhost/gitweb
Now to add your network information:
$nmap -sP 10.10.10.0/25 -oG /pathtoyour/nmapsdirectory/file.map
<update> The above command is to scan a subnet with pings then dump the output to a file.</update>

Nmap is a great network scanner. The -sP option sets it to “ping” mode. You don’t need to crawl ports or anything, so ping is fine. It will also resolve the DNS names if DNS is working.
My example:
$nmap -sP 172.10.0.0/16 -oG /home/lab-admin/nmaps/vlan10.map

For what it is worth, a /16 network takes forever. If you know the usable IPs are a much smaller subset, you can specify a range.
$nmap -sP 172.10.0.1-100 -oG /home/lab-admin/nmaps/vlan10.map

The first time, you must add the file to the git repository:
$cd nmaps
$git add /home/lab-admin/nmaps/vlan10.map
$git commit -m "My Network Maps"

Now we have some content on the gitweb page.

wpid1102-media_1358344480271.png
Click tree to see the files you have added.
wpid1103-media_1358344539639.png
Click on blob to see your IP’s in use.
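
If you would rather pull the live hosts out of a .map file than eyeball the blob, the grepable (-oG) format is easy to parse. A small Python sketch, assuming ping-scan lines shaped like "Host: 172.10.0.1 (name)	Status: Up". The function name is mine, and the exact line layout can differ between nmap versions, so check it against your own files:

```python
import re

# One ping-scan result line: "Host: <ip> (<name or empty>)  Status: Up"
HOST_LINE = re.compile(r"^Host:\s+(\S+)\s+\(([^)]*)\)\s+Status:\s+Up\b")


def hosts_up(map_text):
    """Return (ip, hostname) pairs for every host reported as Up.

    hostname is None when nmap could not resolve a name.
    """
    found = []
    for line in map_text.splitlines():
        match = HOST_LINE.match(line)
        if match:
            ip, name = match.groups()
            found.append((ip, name or None))
    return found
```

Feed it the contents of one of the files, for example open("/home/lab-admin/nmaps/vlan10.map").read(), and you get a list you can sort or diff.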

Let’s automate it with crontab

wpid1104-media_1358345701192.png
As you can see it is fairly easy to get rolling. Before you finish, run the nmap command for each subnet and git add each file to the repository.
Now we will tell cron to run the nmap scans on a regular interval and git commit the changes.
$vi ip-scanscript.sh
In this file put each IP scan on its own line, as seen above. Save the changes and exit.
$chmod +x ip-scanscript.sh
$./ip-scanscript.sh
This will create all the files you want in one swoop.
$cd nmaps
$git add *
This will add all the new files in the nmaps directory to the git repository.
Edit the script one more time.
$vi ip-scanscript.sh
wpid1105-media_1358346482687.png
By adding the git commit line to the script you can now set this up to automatically run with cron.
$crontab -e
[note] I was having issues with the range line in the screenshot above. Still testing.
[2nd note] Change the git commit line to include a '-a'. Example:
git commit -a -m "My IP Lists"
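
Since the script itself only lives in a screenshot, here is a rough Python take on the same idea: one nmap ping scan per subnet, then one git commit -a at the end. It is a sketch, not the script from the screenshot. The function names and the targets mapping are my own, and it assumes nmap and a git new enough to support -C are on the path:

```python
import subprocess


def build_scan_commands(targets, repo_dir):
    """Build the nmap and git command lines without running them.

    targets maps an output file name to an nmap target (subnet or range).
    """
    commands = []
    for file_name, target in targets.items():
        commands.append(["nmap", "-sP", target, "-oG", f"{repo_dir}/{file_name}"])
    # One commit at the end picks up every changed, already-tracked map file.
    commands.append(["git", "-C", repo_dir, "commit", "-a", "-m", "My IP Lists"])
    return commands


def run_scans(targets, repo_dir):
    """Run each scan, then commit the results."""
    for command in build_scan_commands(targets, repo_dir):
        # check=False because git commit exits non-zero when nothing changed.
        subprocess.run(command, check=False)
```

Something like run_scans({"vlan10.map": "172.10.0.1-100"}, "/home/lab-admin/nmaps") does one pass. Remember the files still need a one-time git add before commit -a will track them.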
If you need help with crontab this is where I go when I can’t remember.
http://www.adminschoice.com/crontab-quick-reference
http://www.thegeekstuff.com/2011/07/cron-every-5-minutes/
I set mine to run every 4 hours.
Hope this is helpful. There is not much reason to buy a fancy product when this can be up and running so quickly.

More References:
http://gitref.org/basic/

Book Review: Automating vSphere with VMware vCenter Orchestrator

So to be 100% honest, I have had this book on my desk for several months. Just staring at me. Calling my name. VMware Press provided this copy to me along with Mike Laverick’s SRM book, and so I am finally going to review the first one.

Cody Bunch does an amazing job of breaking down one of the most mystifying yet powerful products hidden in the VMware portfolio. VMware vCenter Orchestrator is almost mythical in the promises of automation of typical tasks of a vSphere administrator. While you can bang your head against the wall for weeks trying to figure out how to properly set up the vOrchestrator server and client, I was able to use Cody’s guidance to have it operational and running test workflows in just a few hours (I am a slow reader).

I can’t stress enough the need for automation and orchestration in today’s virtual machine environment. The business is demanding more and more from the Virtualization team, and in order to deliver, vCenter Orchestrator is a good start since you probably already OWN it.

Hopefully soon there will be an update with information on the vApp version of Orchestrator. Check it out here on Amazon or your favorite book reseller.

Thanks again

Leadership in the Cloud (And everywhere else)

This is really a post about leadership in general, but I like to apply it to our industry. I am totally cool if you take these concepts and apply them elsewhere.

In any work environment there is constant posturing, politicking and conflict that has nothing to do with the actual cause of the workplace. I am going to offer a few leadership tips for everyone, not just for managers, VPs and directors. Tips that we can all put to use.

1. It is not all about you. We all know that “guy” (or girl). Using every opportunity to push others down and himself up. Using others’ backs to climb on never lasts. Being the MVP of a losing team is never my goal; make everyone around you better. The skills involved in doing that will take you further than your daily task knowledge. No one ever says, “Wow, Jon sure can deploy a sweet VM.” If you are known for adding value, contributing and making everyone better, that is how what you do will last. Valuing your team as something more than tools to make you look good is a good start.

2. Have a Purpose/Mission. I am here to change the world. Personally and Professionally. I have done jobs and have volunteered with people and organizations where no one knows why they do what they do. If you are making Pizza, make life changing pizza. If you are building next-gen datacenters, do it in a way that will alter life for someone.

3. Lead, Even if you aren’t supposed to. Don’t sit around and wait to be asked to do something leadershippy.

4. Have a Strategy. If you don’t know why you do what you do, get that first. Then decide how the world will look when you are done. Impact (well, good impact) on people will not happen by accident.

5. If you see a problem be part of the Solution. Stop complaining. There is only so much time in the day. Personally, it is natural for me to complain. I am very good at pointing out faults in everything. I have to consciously make the decision to work on the solutions for things I can change and shut up about the other stuff (for now). Some things just need the proper timing.

6. Community. Jump into the deep end of the pool of community. Make this a core tenet of everything you participate in. You cannot do it all by yourself. Community substitutes like Twitter and Facebook are a start, but go meet in person with some real people. Just an idea.

The most cynical of my readers never started reading this. If you got this far, I hope in your mind you see how this applies to you. Of course any comments are welcome.

Some Reality for us Infrastructure Peeps or Apps are cool too

Don’t you just love double titles?

For many years I have been an infrastructure guy. I really liked how the cables, processors, memory and blinking lights worked. Applications were often the necessary evil tolerated so that I could play with cool technology. During my own journey toward learning about the cloud it has become increasingly important to consider the function of the application. Six years ago me would totally punch me in the face right now. Traitor. 🙂

1 – Don’t get your App messed up in my resource buckets of awesomeness

 

So the reality check to the Infrastructure geek in me is this: The application teams really think of what you do as the network. That is why when anything is ever wrong it is always “the network’s” fault. What we love to do is getting abstracted more and more. I will still contend that is very important and very hard to do. Whether you are building reference architectures or deploying a converged infrastructure appliance almost no one but us cares. They just want the data to do their jobs. So while we have really great discussions about speeds and feeds, the guy in the picture below just wants the app. From the hypervisor down we need to design with the application in mind or we will risk becoming like that goth dude locked in the server room on IT Crowd.

 

2 Honey badger don’t care about FCoE

My next post will get into what I have been researching regarding what is out there and hopefully help us (infra. peeps) understand our App/Dev brothers better.

You are probably an Infrastructure person if:

  1. You read this blog.
  2. You work mainly with Virtualization
  3. Storage Admin
  4. Network Admin
  5. You like to make fun of DBA’s

 

No clever title – ESXCLI

I have been missing in action for a few weeks. It is time to catch up for all the lost time. One topic I feel many people don’t know too much about is esxcli. I know how to do what I usually do with esxcli. There is a lot more there for us to explore.

First stop and take a look at the virtuallyGhetto article.

It can be run from the Service Console, the ESXi Tech Support Mode command line, or from the vMA. As William points out, if you are running these commands from the vMA you need to authenticate individually to each host. He goes on to list some articles that go over the most used case of esxcli, swiscsi.

A couple of quick examples I like to use:

esxcli nmp device setpolicy --device naa.6090a07800c2ea66b8c114050000c00d --psp VMW_PSP_RR

This command changes the policy for a storage device to another path selection policy. In this case it is Round Robin. This is great for when you are rebuilding ESX and the storage is already zoned. ESX will add the storage with the default PSP and changing a few dozen datastores on each host one at a time via the GUI can be VERY tedious.
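
When there are a few dozen devices, I would rather generate the commands than type them. A minimal Python sketch that only builds the command strings in the same syntax as above; it does not run anything, the function name is mine, and where the list of device IDs comes from is up to you:

```python
def setpolicy_commands(device_ids, psp="VMW_PSP_RR"):
    """Return one 'esxcli nmp device setpolicy' line per device ID."""
    return [
        f"esxcli nmp device setpolicy --device {device_id} --psp {psp}"
        for device_id in device_ids
    ]
```

Print the result, look it over, then paste it into the console or the vMA session for each host.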

Then how do I change the default PSP?

esxcli nmp satp setdefaultpsp --psp VMW_PSP_RR --satp VMW_SATP_DEFAULT_AA

This can be modified for different array types after the “--satp” tag or different path policies after the “--psp” tag.

For the VCAP-DCA4 exam I am studying for, I wonder how much deeper than this they will go. I would think most Data Center Administrators need to set up swiscsi settings and possibly change path policies. Anything I am missing? If you check out Duncan’s article here, it will be great to know how to list what is available.

Equallogic, VAAI and the Fear of Queues

Previously I posted on how using bigger VMFS volumes helps Equallogic reduce their scalability issues when it comes to total iSCSI connections. A commenter asked whether this means we can have a new best practice for VMFS size. I quickly said, “Yeah, make em big or go home.” I didn’t really say that, but something like it. Since then, the commenter has come back with a long response from Equallogic saying VAAI only fixes SCSI locks, and all the other issues with bigger datastores still remain. ALL the other issues being “Queue Depth.”

Here is my order of potential IO problems with VMware on Equallogic:

  1. Being spindle bound. You have an awesome virtualized array that will send IO to every disk in the pool or group. Unlike some others you can take advantage of a lot of spindles. Even then, depending on the types of disks some IO workloads are going to use up all your potential IO.
    Solution(s): More spindles is always a good solution if you have unlimited budget. Not always practical. Put some planning into your deployment. Don’t just buy 17TB of SATA. Get some faster disk and break your Group into pools and separate the workloads into something better suited to the IO needs.
  2. Connection Limits. The next problem you will run into if you are not having IO problems is the total iSCSI connections. In an attempt to get all of the IO you can from your array you have multiple vmk ports using MPIO. This multiplies the connections very quickly. When you reach the limit, connections drop and bad things happen.
    Solution: The new 5.02 firmware increases the total maximum connections. Additionally, bigger datastores mean fewer connections. Do the math.
  3. Queue Depth. There are queues everywhere: the SAN ports have queues, each LUN has a queue, and the HBA has a queue. I would defer to this article by Frank Denneman (a much smarter guy than myself), which makes the case that a balanced storage design is the best course of action.
    Solution(s): Refer to problem 1. Properly designed storage is going to give you the best solution for any potential (even though unlikely) queue problems. In your great storage design, make room for monitoring. Equallogic gives you SANHQ. USE IT!!! See how your front end queues are doing on all your ports. Use ESXTOP or RESXTOP to see how the queues look on the ESX host. Most of us will find that queues are not a problem when problem one is properly taken care of. If you still have a queuing problem, then go ahead and make a new datastore. I would also request Equallogic (and others) release a Path Selection Policy plugin that uses a Least Queue Depth algorithm (or something smarter). That would help a lot.
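
The “do the math” in problem 2 can be sketched in a couple of lines of Python. This is a back-of-the-napkin estimate, not an Equallogic formula: it assumes every host logs in to every volume from every iSCSI vmk port, and the function name and numbers are made up for illustration. Real counts depend on your MPIO setup, so compare against what SANHQ reports:

```python
def iscsi_connections(hosts, vmk_ports_per_host, volumes):
    """Estimate total iSCSI connections to the group.

    Assumes each host opens one connection per vmk port to each volume.
    """
    return hosts * vmk_ports_per_host * volumes
```

With 8 hosts and 2 vmk ports each, 20 small datastores works out to 320 connections, while 5 big ones is only 80.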

So I will repeat my earlier statement that VAAI allows you to make bigger datastores and house more VMs per store. I will add a caveat: if you have a particular application that needs a high IO workload, give it its own datastore.

Update Manager Problem after 4.1 Upgrade

A quick note to hopefully publicize a problem I had which I see is discussed in the VMware Community Forums already.

After building a new vCenter Server and upgrading the vSphere 4.0 databases for vCenter and Update Manager, I noticed I could not scan hosts that were upgraded to 4.1. To be fair, by upgrading I mean rebuilt with a fresh install but with the exact same name and IP addresses. It seems that the process I took to upgrade has some kind of weird effect on the Update Manager database. The scans fail almost immediately. I searched around the internet and found a couple of posts on the VMware Forums about the subject. One person was able to fix the problem by removing Update Manager and, when reinstalling, selecting the option to install a new database. I figured I didn’t have anything important in my UM database, so I gave it a try and it worked like a champ.

Right now there are not any new patches for vSphere 4.1, but I have some Extension packages that need to be installed (Xsigo HCA Drivers). I wanted to note that I like the ability to upload extensions directly into Update Manager. This is a much cleaner process than loading the patches via the vMA for tracking and change control purposes.