Archive for the ‘Opalis’ Category

Automate patch & restart management in the #datacenter using #Microsoft Orchestrator and #wsus #sysctr #automation #mvpbuzz

August 18, 2012

Introduction:

I have been working on a very interesting task for our cloud: patch management automation.

One of the challenges we face as a service provider (or cloud provider, if you are not a service provider) is patch management within our infrastructure and the cloud.

For years there have been tools and applications that can push updates from vendors to our servers; WSUS and SCCM are great examples. But one piece of the puzzle has been missing.

What about restart management for those servers and applications? How do we manage the relationship between server patches, restarts and restart order? Let us take a deeper look at that.

Suppose that you have a typical infrastructure, which may or may not be cloud based, consisting of the following:

  • 2 Domain Controllers.
  • 1 SQL cluster; 2 Nodes.
  • 2 IIS Front-End Servers running a web application.
  • 2 TMG 2010 servers.

Suppose that you use WSUS/SCCM, have specified a restart schedule and approved the updates, and are now waiting for the servers to restart. You have 2 options here:

  • use a single restart schedule for all servers, which means that all servers will reboot at the same time.
  • configure multiple schedules based on OU/GPO, so that servers restart on different schedules for different roles, which is fine.

With the first option, the IIS servers will usually restart faster than the SQL cluster, so their web application might not start because SQL is not running. The IIS servers might also restart before the Domain Controllers and might not find the credentials needed to start the web applications. The same goes for the SQL cluster, which might reboot before the DCs and fail. At the end of the day, who knows?!

The second option is cool, but you will have a larger maintenance window. You don’t know when servers will finish rebooting, so you have to wait: assign 30 minutes for the DC reboot, for example, then another 30 minutes for the SQL servers reboot, etc. This hurts your SLA and increases your maintenance window.

The Solution:

You know your infrastructure requirements, so you know the restart order and priority for your servers. You need to have this relationship mapping before anything else, as it will be the foundation.

You don’t need a fancy Visio diagram or relationship table; all you need is a simple table saying, for example:

Server Name Restart Order
Server1 1
Server2 2

This is just an example; you can make it as complex as you want.

Later you can use System Center Orchestrator to automate your patching and restarts based on the relationship you defined. This is a very effective way to save your life and time: Orchestrator can interpret your restart order and force servers that need a restart to restart in the order you specified, on the schedule you need. You can also kick off the whole process manually; it doesn’t make a difference.

The How:

Disclaimer: use this article at your own risk. The solution described here is not complete; you need to do further testing, customization and modification to make it enterprise ready. The scripts, files and workflows here are provided AS-IS without any warranty.

Building the blocks: In this section we explore the high-level architecture of the solution and its components and then we proceed with its implementation.

The requirements are very simple: we are using WSUS to deploy updates to servers, we have a restart order like the table above, and we want to restart our servers according to that restart order.

The Lab Setup: I am running 1 Domain Controller that also hosts my WSUS server, 1 Orchestrator server running SQL 2008 and Orchestrator, and 4 servers running Windows 2008 (srv1, srv2, srv3, srv4).

The restart order for the servers is as follows:

Server Name Restart Order
srv1 1
srv2 3
srv3 4
srv4 2

I mapped this restart order in a simple SQL database configured as follows:

image
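
As a minimal sketch of that mapping (not the actual database, which is only shown in the screenshot), the snippet below builds the lab table in an in-memory SQLite database and runs the same ordering query that the launcher runbook uses. SQLite stands in for SQL Server here and the column types are assumptions; the table and column names come from the query quoted in the runbook description.

```python
import sqlite3

# Lab restart order from the table above: (hostname, restartorder).
LAB_ORDER = [("srv1", 1), ("srv2", 3), ("srv3", 4), ("srv4", 2)]


def restart_sequence(rows):
    """Return hostnames sorted by restart order.

    Assumption: a two-column table; the real schema may differ.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE restartordertbl ("
        "hostname TEXT PRIMARY KEY, restartorder INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO restartordertbl VALUES (?, ?)", rows)
    cursor = conn.execute(
        "SELECT hostname FROM restartordertbl ORDER BY restartorder"
    )
    result = [row[0] for row in cursor]
    conn.close()
    return result


print(restart_sequence(LAB_ORDER))  # prints ['srv1', 'srv4', 'srv2', 'srv3']
```

With the lab values, srv4 is restarted second even though it is listed last, which is exactly the kind of ordering a single restart schedule cannot express.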

The Runbooks Architecture:

Orchestrator has 3 runbooks (RBs) defined to achieve what we want:

    1. the first RB is the launcher. It queries the database using the following simple query: (use test select hostname from restartordertbl order by restartorder), which retrieves the server names sorted by their restart order.
    2. the RB then writes the servers, in restart priority order, to a text file that a later RB reads the server names from (you can write your own script to store that in SQL or a CSV file; I used a text file for simplicity).
    3. the RB sets two counters, the number of rows returned and the incremental counter used for looping, and then invokes the Core RB.
       image
    4. the Core RB is the heart of this environment. It gets the counters and compares them; if they are not equal, it knows that it needs to loop and proceeds with reading from the text file.
    5. note that the link between the compare value action and the append line action (the purple link) performs the actual decision: it allows the RB to proceed only if the result is false, which means the values are not equal, and stops if the values are equal, which means the loop is completed or no servers were returned by the query.
    6. it executes the following PowerShell script to find out whether the server is pending a reboot:

# The machine name is Orchestrator published data: the line text read from the text file.
$baseKey = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey("LocalMachine", "{Line text from the read-line activity}")
$key = $baseKey.OpenSubKey("Software\Microsoft\Windows\CurrentVersion\Component Based Servicing\")
$subkeys = $key.GetSubKeyNames()
$key.Close()
$baseKey.Close()
# A "RebootPending" subkey means the server is waiting for a restart.
If ($subkeys | Where {$_ -eq "RebootPending"})
{
    # Fail the activity on purpose so the runbook follows the restart link.
    throw "updates"
}

The script queries the pending reboot status of the machine. If the machine is pending a reboot, the script breaks by throwing an error; if not, it completes successfully.

    7. the link between the run PowerShell action and the restart action (in red) allows the RB to take the restart path only if the PowerShell result is failed, which is caused by the thrown error when the server is pending a restart. If not, it takes the other path (the green link), which means that the server is not pending a restart, and starts the “Counter Increaser” RB.
       image
    8. the Counter Increaser RB is the simplest one: it increases the incremental counter and invokes the Core RB, looping again.
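
To make the control flow easier to follow, here is a rough Python sketch of how the three runbooks cooperate. It is only a model of the logic, not something Orchestrator runs: the function name and the two callbacks (`is_reboot_pending`, `restart`) are hypothetical stand-ins for the PowerShell check and the restart action, and it assumes the restart path also ends by invoking the Counter Increaser.

```python
def run_patch_cycle(servers, is_reboot_pending, restart):
    """Walk the ordered server list the way the Launcher, Core and
    Counter Increaser runbooks do, returning the servers restarted."""
    total = len(servers)  # Launcher: counter holding the number of rows returned
    index = 0             # Launcher: incremental counter used for looping
    restarted = []
    while index != total:               # Core RB: compare the two counters
        server = servers[index]         # Core RB: read the next line of the text file
        if is_reboot_pending(server):   # check "fails" -> red link to the restart action
            restart(server)
            restarted.append(server)
        index += 1                      # Counter Increaser RB, then Core RB runs again
    return restarted                    # counters equal: the loop is complete


# Example with the lab order, pretending srv4 and srv3 have pending reboots.
pending = {"srv3", "srv4"}
restart_log = []
result = run_patch_cycle(
    ["srv1", "srv4", "srv2", "srv3"],
    lambda name: name in pending,
    restart_log.append,
)
print(result)  # prints ['srv4', 'srv3']
```

An empty query result makes both counters equal to zero, so the loop body never runs, which matches the purple-link behaviour described in step 5.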

Things to note:

  • a runbook in Orchestrator can’t loop back on itself, so you need another RB for that; this is why I have the Counter Increaser RB.
  • the PowerShell script could restart the machine, but that didn’t work for me, so I used the restart action.
  • you can check a link’s behaviour by selecting the link and clicking Properties.

Things that need improvement:

These are test RBs; we use different RBs in production that meet our specific environment. You will need to modify the above RBs to:

  • check whether the server is online or not.
  • wait between restarts: the RBs restart servers back to back, so you will need to include a sleep time and a restart check to make sure that a server has completed its restart before proceeding with the next one.
  • make the process parallel, and maybe restart servers that do not depend on others directly.
  • send a notification to the administrator or customer.
  • run post-restart checks to make sure that the server completed the reboot and its services started successfully.
  • maybe integrate with SCSM and drive approvals and workflows from there.
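
As one possible shape for the "sleep time and restart check" item, the sketch below polls a probe until it succeeds or a timeout expires. Everything in it is an assumption made for illustration: the helper names, the 30 minute timeout, the 15 second interval, and the idea of treating an open TCP port (for example RDP on 3389) as "the server is back". In a real runbook you would also wait for the server to go down first, otherwise the probe can succeed before the reboot has even started.

```python
import socket
import time


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until(probe, timeout_s=1800.0, interval_s=15.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it returns True or timeout_s has elapsed.

    Returns True when the probe succeeded and False on timeout.
    sleep and clock are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Hypothetical usage: block the next restart until srv1 answers on 3389.
# if not wait_until(lambda: port_is_open("srv1", 3389)):
#     raise RuntimeError("srv1 did not come back within the maintenance window")
```

Because the loop returns as soon as the probe succeeds, the maintenance window shrinks to the time the servers actually need instead of a fixed 30 minutes per role.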

You can go epic with this foundation: be dynamic in server queries and database names; this can go on endlessly. Use these RBs as your foundation and add more and more blocks to meet your infrastructure and customers’ goals. Also feel free to comment or ask questions; I will be glad to help.

The working RBs are attached below and include everything. Make sure to check each step and read the descriptions thoroughly. You can download them from https://skydrive.live.com/embed?cid=6B566FD2C47B21C4&resid=6B566FD2C47B21C4%21130&authkey=AB25TJ854Zc4IT0

until later time and happy Eid

Mahmoud

Automating #Linux Machines #provisioning on #Microsoft Hyper-v #Cloud using #opalis #hyperv

April 23, 2011

One of my customers assigned me the task of automating Linux machine provisioning on the Hyper-V cloud they are running. They are still evaluating the Hyper-V cloud capabilities, and they were wondering if they can automate the provisioning of Linux machines onto Hyper-V servers.

Because they are still evaluating it, the process for the request and automation is not yet clear in their minds, but the question and request were simple: we want to automate the process of copying and configuring the machine, especially since they are running lots of Linux virtual machines.

The setup:

– 2 Hyper-V nodes running in a cluster, each with 128 GB of memory, plus SCCM, SCOM, SCVMM 2008 R2 SP1 and DPM 2010.

– requests will come from a help desk or purchasing system; this is not clear yet.

Before we start, here are some notes for the Microsoft guys working on this:

– I spent a couple of days trying to figure out how sysprep can be done on Linux machines and how to script it. The important note is that Linux doesn’t have SID-related information bound to the machine, so copying the machine and renaming it will bring a totally new machine to the cloud. reference here.

– the machine name for Linux can be configured in several places. Keep in mind that if you use the hostname command to set the host name, it will revert to the default name after a restart; to set it permanently, you need to set the host name in the /etc/sysconfig/network file.

– to execute commands remotely you will need to SSH to the machine.

Now let us rock n roll:

I have no experience in Linux scripting, so the steps mentioned here are just guidelines and placeholders for others to use to kick off their own implementations; I don’t claim that they are the best way to do it.

The workflow you will configure requires the following:

– Create a template Linux virtual machine by creating a normal machine on any Hyper-V host.

– Install the Linux Integration Components for Hyper-V. The main point to note is that you will need to install the development tools on the machine so it can successfully compile the source.

– After installing the integration components, assign a static IP to the machine (Opalis will use it later to SSH to the machine and run the configuration commands).

– Shut down the machine and, from the SCVMM 2008 R2 admin console, copy the virtual machine to the library (if the Hyper-V hosts are located in a different forest or DMZ, this can be done by copying it).

Now let us start:

– Install Opalis; the best video can be found here.

– Import the SCVMM 2008 R2 Integration Pack.

– Create the provisioning workflow as follows:

image

The workflow will do the following:

– Create a random name that will be assigned to the machine. This is just a placeholder; the machine name can be retrieved from a text file, SQL DB, etc.

image

– Create a VM from the VM template on the SCVMM 2008 R2 server and assign the name generated by the previous task to it; the name will be Linux-randomtextvalue.

image

To assign the name Linux-randomtextvalue, type Linux- in the VM name field, then right-click in the field, choose Subscribe, then Published Data, and pick the random text result from the previous step.

– The next step gets the VM; set the name to Linux-randomtextvalue, the same as in the previous step.

– The next step starts the VM. Since this task requires the VM ID, use Subscribe and Published Data to pass the VM ID from the “Get VM” task.

– The link between “Start VM” and the next SSH command waits for 300 seconds (5 minutes) to allow the machine to fully start.

– The next SSH command connects to the static IP of the machine and changes the name by searching the file /etc/sysconfig/network for the default name “localhost.localdomain” and replacing it with the random text result:

image

The command will be: sed -i 's/localhost.localdomain/Linux-{Random Text from “Generate VM name”}/g' /etc/sysconfig/network

– The next step configures the machine to use DHCP. Using the same SSH step, the command will be: sed -i 's/none/dhcp/g' /etc/sysconfig/network-scripts/ifcfg-eth0

– The next SSH command restarts the VM to apply the settings.

And you are done.

Again, you can play with the workflow and create your own flow. There are some guides on the internet for automating requests that come from SCSM into Opalis, etc., but this article is meant to give you an idea of how the Linux machine configuration will generally be done.
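
For readers who want to try the text edits without a VM, here is a small Python sketch of what the workflow does to the two files. It is not part of the Opalis workflow: the function names, the 8 character suffix and the sample file contents are assumptions. `set_hostname` does a literal replacement (slightly stricter than the sed pattern, where `.` matches any character), and `enable_dhcp` deliberately swaps in a narrower rule than `s/none/dhcp/g` by rewriting only the `BOOTPROTO` line.

```python
import re
import secrets
import string


def generate_vm_name(prefix="Linux-", length=8):
    """Build a name like Linux-<random text> (suffix length is an assumption)."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(length))
    return prefix + suffix


def set_hostname(network_text, new_name):
    """Replace the default host name in the text of /etc/sysconfig/network."""
    return network_text.replace("localhost.localdomain", new_name)


def enable_dhcp(ifcfg_text):
    """Switch the BOOTPROTO line of an ifcfg-eth0 file to dhcp."""
    return re.sub(r"^BOOTPROTO=.*$", "BOOTPROTO=dhcp", ifcfg_text,
                  flags=re.MULTILINE)


# Sample file contents, made up for the example.
network = "NETWORKING=yes\nHOSTNAME=localhost.localdomain\n"
ifcfg = "DEVICE=eth0\nBOOTPROTO=none\nONBOOT=yes\n"

name = generate_vm_name()
print(set_hostname(network, name))
print(enable_dhcp(ifcfg))
```

Limiting the DHCP change to the `BOOTPROTO` line avoids touching any other value in the file that happens to contain the word "none".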
