Clustering Interview Questions & Answers

  • Upload
    grsrik


  • 7/21/2019 Clustaring Interview Qua & Ans

    Here are a few of the questions. Some of them may be tricky.

    -> How will you restart SQL Server on a cluster without causing a failover?

    A: In the cluster administration tool, right-click the SQL Server resource and choose Take Offline, then Bring Online.

    -> What will you do if you want to add a disk to the SQL Server cluster group?

    A: Add the disk to the group in the Cluster Administrator tool (or in the Failover Cluster admin tool from the 2008 version onward), then use the Add Dependency option so that SQL Server depends on the new disk.

    -> As a DBA, how will you design for an active/active cluster requirement (i.e., how will you manage resources if an instance fails over)?

    A: Please read the MSDN article on this for a better understanding.

    -> Steps for failover?

    A: Please read the full MSDN documentation on this.

    -> Difference between SQL Server 2005 and SQL Server 2008 cluster installation?

    A: In SQL Server 2005, we have the option of installing SQL Server on the remaining nodes from the primary node. In SQL Server 2008, we need to log in to each node separately to install the SQL Server cluster.

    Q: If using virtual machines and clustering / failing over at that level (not SQL Server), is there any reason that SQL Server Standard Edition won't work? Someone once told us in a SQL class that Enterprise Edition was necessary for this.

    Answer from Brent: Don't you just love those "someone once told us" things? You'll want to get them to tell you why. Standard Edition works fine in virtual machines. It may not be cost-effective once you start stacking multiple virtual machines on the same host, though, because you have to pay for Standard Edition for every guest.

    Q: Hi, with mirroring being deprecated and AlwaysOn AG only available with Enterprise Edition, what are our HA options going to be with Standard Edition in the future? Any ideas if AlwaysOn synchronous will make it into Standard?

    Answer from Jeremiah: You have a few HA choices with SQL Server 2012 Standard Edition and beyond. Even though mirroring is deprecated, you could feasibly use mirroring in the hope that something new will come out. Obviously, this isn't a viable option. The other HA option is to use clustering. SQL Server Standard Edition supports 2-node clusters, so you can always use it for HA.

    TRAINING FOR CLUSTERING, ALWAYSON

    Q: Is there a good resource on setting up a lab environment for a clustering setup?

    Answer from Kendra: I'm so glad you asked!

    Matt Velic has put together a guide on how to build a virtual lab with a Windows Server 2012 cluster with VirtualBox and SQL Server 2012 Evaluation Edition, complete with eBook.

    If folks want to test AlwaysOn, I did a video a while back on how to plan an Availability Group lab, and our AlwaysOn page has a setup checklist PDF.

    HOW TO MANAGE ALWAYSON AVAILABILITY GROUPS

    Q: Have you experienced, or do you know of, a split-brain scenario in AlwaysOn Availability Groups where, when the secondary node comes up to take over the primary role, transactions become inconsistent? And how can it be avoided?

    Answer from Brent: Ooo, there's several questions in here. First, there's the concept of split-brained clusters: when two different database servers both believe they're the master. Windows Server Failover Clustering (WSFC) has a lot of plumbing built in to avoid that scenario. When you design a cluster, you set up quorum voting so that the nodes work together to elect a leader. In theory, you can't run into a split-brain scenario automatically, but you can most definitely run into it manually if you go behind the scenes and change cluster settings. The simple answer here: education. Learn about how the quorum process works, learn the right quorum settings for the number of servers you have, and prepare for disaster ahead of time. Know how you'll need to react when a server (or an entire data center) goes down. Plan and script those tasks, and then you can better avoid split-brain scenarios.

    Q: Can you recommend any custom policies for monitoring AlwaysOn? Or do the system policies provide thorough coverage? Thank you!

    Answer from Brent: I was a pretty hard-core early adopter of AlwaysOn Availability Groups because I had some clients who needed it right away. In that situation, you have to go to production with the monitoring you have, not the monitoring you want. The built-in stuff just wasn't anywhere near enough, so most of my early adopters ended up rolling their own. StackOverflow's about to share some really fun stuff there, so I'd keep an eye on Blog.ServerFault.com. You should also evaluate SQL Sentry 7.5's new AlwaysOn monitoring. It's the only production monitoring I'm aware of, although I know all the other developers are coming out with updates to their tools for monitoring too.

    Q: Is it wise to have some primary availability groups on one node and other primary groups on another of the servers that form the cluster? Or is it better to have all primary groups on server 1 and the secondaries on server 2?

    Answer from Brent: If you split the primaries onto two different nodes, then you can do some load balancing.

    Links referenced above:
    http://mattvelic.com/virtual-lab/
    http://www.brentozar.com/archive/2012/04/video-how-test-availability-groups-sql-server/
    http://www.brentozar.com/sql/sql-server-alwayson-availability-groups/
    http://en.wikipedia.org/wiki/Split-brain_(computing)
    http://technet.microsoft.com/en-us/library/jj612870.aspx
    http://blog.serverfault.com/
    http://greg.blogs.sqlsentry.net/2013/07/sql-sentry-v75-better-alwayson.html

    Q: Would you consider AlwaysOn AG read-only replicas as a method to offload or load balance reporting? Looks like the Read Intent option acts like a load balancer for reading off of those DBs, right?

    Answer from Brent: Offload yes, load balance no. The read intent options give you the ability to push read-only queries to a different replica, but there's no load balancing. Your clients just hit the first server in the list. If you need true load balancing, you'll want to put all of the read-only replicas behind a real load balancing appliance.
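
    To make the "offload yes, load balance no" point concrete, here is a minimal Python sketch. It is not SQL Server's implementation, only a toy model of the behavior described above: a read-intent connection goes to the first available replica in an ordered list. The function name and the replica names (SQLNODE2, SQLNODE3) are hypothetical.

```python
from typing import Optional, Sequence, Set


def route_read_intent(routing_list: Sequence[str], online: Set[str]) -> Optional[str]:
    """Return the first replica in the ordered routing list that is online."""
    for replica in routing_list:
        if replica in online:
            return replica
    return None  # no readable replica is available


# Hypothetical replica names, in routing-list order.
routing_list = ["SQLNODE2", "SQLNODE3"]
online = {"SQLNODE2", "SQLNODE3"}

# Five read-intent connections in a row all land on the same replica.
targets = [route_read_intent(routing_list, online) for _ in range(5)]
print(targets)
```

    In this model the second replica only receives traffic when the first one is unavailable, which is why spreading read load evenly would need something in front of the replicas.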

    WINDOWS CLUSTERING SETUP AND MANAGEMENT

    Q: Where can I find a good list of cluster hotfixes for SQL 2008 R2, and perhaps the OS as well?

    Jes here. You can go to the Update Center for Microsoft SQL Server to find the latest CUs and hotfixes. Check the Support pages for Windows Server 2008 R2. Updates aren't released as cluster-specific. This is why it's really important to have a test or QA environment that is also set up as a cluster, so you know if the cluster services are affected at all.

    Q: What is the recommended order/procedure when you have to do Windows updates to servers in a cluster?

    Answer from Kendra: Microsoft knew you were gonna ask this! Check out their "SQL Server failover cluster rolling patch and service pack process" KB here. But do yourself a favor and always deploy patches to a non-production test cluster first and let them burn in a bit.

    Q: From your previous answers, it sounded like you don't recommend using Windows 2008 R2 for AlwaysOn. Can you elaborate a bit more on why Windows 2012 is better suited for this? I need more persuasive power to talk the rest of the folks at my company into using it.

    Answer from Brent: Sure, check out the AlwaysOn Availability Groups Lessons Learned video at the bottom of that page.

    Q: Would you have a single DTC group or multiple groups configured for a 4-instance cluster?

    Answer from Kendra: There's no shortcut here: you have to decide on an instance-by-instance basis. For each instance you gotta determine how much it uses distributed transactions, and how impacted it might be if DTC were to temporarily be offline. Review Cindy Gross's information on DTC to find out the pros and cons of different approaches to configuring DTC.

    Links referenced above:
    http://technet.microsoft.com/en-us/sqlserver/ff803383.aspx
    http://technet.microsoft.com/en-us/windowsserver/cc298325
    http://support.microsoft.com/kb/958734
    http://www.brentozar.com/sql/sql-server-alwayson-availability-groups/
    http://blogs.msdn.com/b/cindygross/archive/2009/02/22/how-to-configure-dtc-for-sql-server-in-a-windows-2008-cluster.aspx

    SQL SERVER CLUSTERING WITH VMWARE AND HYPER-V

    Q: Is VMware HA a good alternative to use instead of a Microsoft cluster?

    Answer from Jeremiah: The HA choice comes down to where you want your HA to be managed. VMware HA pushes the high availability question out of the SQL Server realm and into the VMware infrastructure. More than anything else, this is a business decision. Just be sure you're happy with the decision of which team is managing your uptime.

    Q: When using a virtualized active/passive 2008 R2 cluster with underlying iSCSI storage, can the nodes be on different hosts, or is FCoE needed to have nodes on different hosts?

    Answer from Brent: Check out VMware's knowledge base article on Microsoft cluster support. It lays out your options for iSCSI, FC, FCoE, and more, and separates them by shared-disk clustering versus non-shared-disk (AlwaysOn Availability Groups).

    Q: Any thoughts on implementing AlwaysOn in conjunction with a virtual SQL environment using VMware HA / Site Recovery Manager (SRM)?

    Answer from Kendra: With this level of complexity, when things get tricky it's incredi-hard to sort out. You gotta have a rockstar team with great processes and communication skills to handle problems as they arise, and you are going to hit problems.

    Even if you have the rockstar team, you want to first ask if there's a simpler way to meet your requirements with a less risky cocktail of technologies. If you rush into what you describe, you'll find that your high availability solution becomes your primary cause of downtime.

    SHARED STORAGE FOR CLUSTERS

    Q: Was reading a great article from Brent on SQLIO. How does this work on a SQL cluster?

    Answer from Kendra: You run SQLIO against the storage (not the SQL Server instance), so it works the exact same way.

    Q: After setting up the cluster and adding the various cluster data drives, how can I add additional drives after gaining new internal storage?

    Answer from Kendra: Before you touch production, make sure you've got a lab environment. If you don't, check out the link above on how to build one. The exact steps to do this are going to vary depending on your version of Windows, your version of SQL Server, and exactly what storage you're talking about.

    For new shared storage on Windows Server 2008 or later, the basic process is presenting the storage to all of the nodes, bringing the drive online on one node, creating a volume, adding the disk in the failover cluster, and then adjusting dependencies in the cluster as needed. (Dependencies can be adjusted online in SQL Server 2008 and later.)

    If you have new non-shared storage that you want to use under tempdb (such as SSDs), you've got to make sure that every node in the cluster has the drives for tempdb online, with volumes created, formatted, and configured identically, and then you can move tempdb files over to it. You will need to restart SQL Server so that the modified tempdb files pick up the new paths.

    Link referenced above:
    http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1037959
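
    A minimal sketch of the tempdb move mentioned above, in Python. It only builds the T-SQL `ALTER DATABASE ... MODIFY FILE` statements and does not connect to any server. The function name, the `T:\TempDB` directory, and the logical file names `tempdev` and `templog` (the usual defaults, which may differ on your instance) are assumptions for illustration.

```python
from typing import Dict, List


def tempdb_move_statements(new_dir: str, files: Dict[str, str]) -> List[str]:
    """Build ALTER DATABASE statements that repoint tempdb files to new_dir.

    files maps each logical file name to its physical file name.
    """
    statements = []
    for logical_name, file_name in files.items():
        path = new_dir.rstrip("\\") + "\\" + file_name
        statements.append(
            "ALTER DATABASE tempdb MODIFY FILE "
            f"(NAME = {logical_name}, FILENAME = '{path}');"
        )
    return statements


# Assumed logical names and a hypothetical local SSD path that must
# exist identically on every node of the cluster.
statements = tempdb_move_statements(
    "T:\\TempDB", {"tempdev": "tempdb.mdf", "templog": "templog.ldf"}
)
for statement in statements:
    print(statement)
```

    The new paths only take effect after SQL Server restarts, which matches the restart noted above.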

    SHARDING AND MIRRORING QUESTIONS

    Q: I have peer-to-peer replication with 3 nodes (all bidirectional). Very beneficial but a big pain to maintain. Is that what the industry feels?

    Answer from Jeremiah: SQL Server peer-to-peer replication solves a very specific need: the ability to have multiple active SQL Servers where writes can occur and where you can have near real-time updates to the other servers. While peer-to-peer replication meets that need, it has a relatively heavy price tag in terms of DBA expertise, support, and licensing costs. Even experienced teams want to have multiple DBAs on staff to deal with on-call rotations and, let's face it, while peer-to-peer replication hasn't been deprecated, it's a difficult feature to work with.

    Q: I've implemented db sharding on Oracle in several environments. Is there an applicable tech in SQL Server?

    Answer from Jeremiah: Sharding is just a buzzword for horizontal partitioning. In a sharded database, either the application or a load balancing router/reverse proxy is aware of the sharding scheme and sends reads and writes to the appropriate server. This can be accomplished with SQL Server, Oracle, MySQL, or even Access. There are no technologies from Microsoft for this, and I'd be wary of anyone attempting to sell something that "Just Works": database sharding is time consuming, requires deep domain knowledge, and adds additional database overhead.
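
    As an illustration of the application being "aware of the sharding scheme", here is a minimal Python sketch of hash-based routing. It is one common scheme, not something the answer above prescribes, and the server names and function name are hypothetical.

```python
import hashlib
from typing import Sequence

# Hypothetical shard servers; a real deployment would load these from config.
SHARDS = ["sqlshard0", "sqlshard1", "sqlshard2"]


def shard_for_key(key: str, shards: Sequence[str] = SHARDS) -> str:
    """Map a sharding key (for example a customer ID) to one server.

    A stable hash is used so every application instance routes the
    same key to the same server.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


print(shard_for_key("customer-42"))
```

    Note that with plain modulo routing, changing the number of shards remaps most keys, which is part of the overhead the answer warns about.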

    Q: Currently using SQL 2008 Mirroring. Planning a move to 2012. Your thoughts about skipping 2012 and going straight to 2014 AlwaysOn technologies?

    Jes here. There were no major changes to Database Mirroring in SQL Server 2012, and I don't foresee any coming in 2014. Eventually (we don't have a specific version yet) Mirroring will be deprecated. Read our AlwaysOn Availability Groups Checklist to get an idea of the work involved in setting these up (it's much more complicated than Mirroring) before you decide to jump in.

    Link referenced above:
    http://www.brentozar.com/sql/sql-server-alwayson-availability-groups/

    Microsoft Cluster Interview Questions and Answers

    > What is clustering? Briefly define and explain it.
    Clustering is a technology used to provide high availability for mission-critical applications. We can configure a cluster by installing the MSCS (Microsoft Cluster Service) component from Add/Remove Programs, which is only available in Enterprise Edition and Datacenter Edition.

    > Types of clusters?
    In Windows we can configure two types of clusters:
    1. NLB (Network Load Balancing) cluster: balances load between servers. This cluster will not provide any high availability. It is usually preferred for edge servers such as web or proxy servers.
    2. Server cluster: provides high availability by configuring an active-active or active-passive cluster. In a 2-node active-passive cluster, one node is active and one node is on standby. When the active server fails, the application will FAILOVER to the standby server automatically. When the original server comes back, we need to FAILBACK the application.
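
    The failover and failback behavior of a 2-node active-passive cluster can be sketched as a small Python model. This is a toy illustration under stated assumptions, not how the Cluster service is implemented: the class, method, and node names are hypothetical, failover is automatic, and failback is a manual step as described above.

```python
class ActivePassivePair:
    """Toy model of a two-node active-passive cluster."""

    def __init__(self, preferred: str, standby: str):
        self.preferred = preferred
        self.standby = standby
        self.healthy = {preferred: True, standby: True}
        self.owner = preferred  # node currently running the application

    def node_failed(self, node: str) -> None:
        """Mark a node as down; fail over if it was running the application."""
        self.healthy[node] = False
        if node == self.owner:
            other = self.standby if node == self.preferred else self.preferred
            # If no healthy node is left, the application is offline (None).
            self.owner = other if self.healthy[other] else None

    def node_recovered(self, node: str) -> None:
        """Mark a node as healthy again; ownership does not move by itself."""
        self.healthy[node] = True

    def failback(self) -> None:
        """Manually move the application back to the preferred node."""
        if self.healthy[self.preferred]:
            self.owner = self.preferred


pair = ActivePassivePair("NODE1", "NODE2")
pair.node_failed("NODE1")      # automatic failover
after_failover = pair.owner
pair.node_recovered("NODE1")   # application stays where it is
after_recovery = pair.owner
pair.failback()                # administrator moves it back
after_failback = pair.owner
print(after_failover, after_recovery, after_failback)
```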

    > What is quorum?
    Shared storage must be provided to all servers. It keeps information about the clustered application and session state, and is used in a FAILOVER situation. This is very important: if the quorum disk fails, the entire cluster fails.

    > Why is quorum necessary?
    When network problems occur, they can interfere with communication between cluster nodes. A small set of nodes might be able to communicate together across a functioning part of a network, but might not be able to communicate with a different set of nodes in another part of the network. This can cause serious issues. In this "split" situation, at least one of the sets of nodes must stop running as a cluster.

    To prevent the issues that are caused by a split in the cluster, the cluster software requires that any set of nodes running as a cluster must use a voting algorithm to determine whether, at a given time, that set has quorum. Because a given cluster has a specific set of nodes and a specific quorum configuration, the cluster will know how many votes constitute a majority (that is, a quorum). If the number drops below the majority, the cluster stops running. Nodes will still listen for the presence of other nodes, in case another node appears again on the network, but the nodes will not begin to function as a cluster until the quorum exists again.

    For example, in a five-node cluster that is using a node majority, consider what happens if nodes 1, 2, and 3 can communicate with each other but not with nodes 4 and 5. Nodes 1, 2, and 3 constitute a majority, and they continue running as a cluster. Nodes 4 and 5 are a minority and stop running as a cluster, which prevents the problems of a "split" situation. If node 3 loses communication with other nodes, all nodes stop running as a cluster. However, all functioning nodes will continue to listen for communication, so that when the network begins working again, the cluster can form and begin to run.
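
    The five-node example above can be checked with a short Python sketch. It models only the node-majority vote count, not the real cluster software; the function name is hypothetical.

```python
from typing import Set


def partition_has_quorum(partition: Set[int], all_nodes: Set[int]) -> bool:
    """Node majority: a partition keeps running only with more than half the votes."""
    votes = len(partition & all_nodes)
    return votes > len(all_nodes) / 2


nodes = {1, 2, 3, 4, 5}

majority_side = partition_has_quorum({1, 2, 3}, nodes)   # nodes 1, 2, 3 can talk
minority_side = partition_has_quorum({4, 5}, nodes)      # nodes 4, 5 are cut off
after_node3_lost = partition_has_quorum({1, 2}, nodes)   # node 3 also drops out

print(majority_side, minority_side, after_node3_lost)
```

    Once node 3 is also isolated, no partition holds three of the five votes, so every partition stops running as a cluster.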

    > Different types of quorum in Windows Server 2008?
    1. Node Majority: used when there is an odd number of nodes in the cluster.
    2. Node and Disk Majority: even number of nodes (but not a multi-site cluster).
    3. Node and File Share Majority: even number of nodes, multi-site cluster.
    4. Node and File Share Majority: even number of nodes, no shared storage.
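
    The selection rules in the list above can be written as a small lookup function. This Python sketch simply encodes those four rules; the function and parameter names are hypothetical.

```python
def recommended_quorum_mode(
    node_count: int, multi_site: bool = False, shared_storage: bool = True
) -> str:
    """Pick a Windows Server 2008 quorum mode from the rules listed above."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    if node_count % 2 == 1:
        return "Node Majority"
    # Even number of nodes: a witness is needed to break ties.
    if multi_site or not shared_storage:
        return "Node and File Share Majority"
    return "Node and Disk Majority"


print(recommended_quorum_mode(3))
print(recommended_quorum_mode(2))
print(recommended_quorum_mode(4, multi_site=True))
print(recommended_quorum_mode(2, shared_storage=False))
```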

    > Different types of quorum in Windows Server 2003?
    Standard quorum: As mentioned above, a quorum is simply a configuration database for MSCS, and is stored in the quorum log file. A standard quorum uses a quorum log file that is located on a disk hosted on a shared storage interconnect that is accessible by all members of the cluster.
    Standard quorums are available in Windows NT 4.0 Enterprise Edition, Windows 2000 Advanced Server, Windows 2000 Datacenter Server, Windows Server 2003 Enterprise Edition, and Windows Server 2003 Datacenter Edition.

    Majority node set quorums: A majority node set (MNS) quorum is a single quorum resource from a server cluster perspective. However, the data is actually stored by default on the system disk of each member of the cluster. The MNS resource takes care to ensure that the cluster configuration data stored on the MNS is kept consistent across the different disks.
    Majority node set quorums are available in Windows Server 2003 Enterprise Edition and Windows Server 2003 Datacenter Edition.

    > Explain each quorum type.
    Node Majority: Each node that is available and in communication can vote. The cluster functions only with a majority of the votes, that is, more than half.
    Node and Disk Majority: Each node plus a designated disk in the cluster storage (the "disk witness") can vote, whenever they are available and in communication. The cluster functions only with a majority of the votes, that is, more than half.
    Node and File Share Majority: Each node plus a designated file share created by the administrator (the "file share witness") can vote, whenever they are available and in communication. The cluster functions only with a majority of the votes, that is, more than half.
    No Majority: Disk Only: The cluster has quorum if one node is available and in communication with a specific disk in the cluster storage.
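
    A short Python sketch of the four rules above, counting the witness as one extra vote where the mode has one. It is a simplified model with hypothetical function and parameter names, not the actual cluster algorithm.

```python
def has_quorum(mode: str, nodes_total: int, nodes_up: int, witness_up: bool = False) -> bool:
    """Apply the quorum rule for the given mode to the current cluster state."""
    if mode == "No Majority: Disk Only":
        # One node that can reach the designated disk is enough.
        return nodes_up >= 1 and witness_up

    total_votes = nodes_total
    votes_present = nodes_up
    if mode in ("Node and Disk Majority", "Node and File Share Majority"):
        total_votes += 1                       # the witness holds one vote
        votes_present += 1 if witness_up else 0
    elif mode != "Node Majority":
        raise ValueError(f"unknown quorum mode: {mode}")

    return votes_present > total_votes / 2     # more than half


# Two nodes plus a disk witness: three votes in total.
one_node_down = has_quorum("Node and Disk Majority", 2, 1, witness_up=True)
witness_down = has_quorum("Node and Disk Majority", 2, 2, witness_up=False)
both_lost = has_quorum("Node and Disk Majority", 2, 1, witness_up=False)
print(one_node_down, witness_down, both_lost)
```

    In the two-node example, losing either one node or the witness still leaves two of three votes, while losing both leaves only one.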

    > How is the quorum information located on the system disk of each node kept in sync?
    The server cluster infrastructure ensures that all changes are replicated and updated on all members in a cluster.

    > Can this method be used to replicate application data as well?
    No, that is not possible in this version of clustering. Only quorum information is replicated and maintained in a synchronized state by the clustering infrastructure.

    > Can I convert a standard cluster to an MNS cluster?
    Yes. You can use Cluster Administrator to create a new Majority Node Set resource and then, on the Quorum tab of the cluster properties sheet, change the quorum to that Majority Node Set resource.

    > What is the difference between a geographically dispersed cluster and an MNS cluster?
    A geographic cluster refers to a cluster that has nodes in multiple locations, while an MNS-based cluster refers to the type of quorum resource in use. A geographic cluster can use either a shared disk or an MNS quorum resource, while an MNS-based cluster can be located in a single site or span multiple sites.

    > What is the maximum number of nodes in an MNS cluster?
    Windows Server 2003 supports 8-node clusters for both Enterprise Edition and Datacenter Edition.

    > Do I need special hardware to use an MNS cluster?
    There is nothing inherent in the MNS architecture that requires any special hardware, other than what is required for a standard cluster (for example, the hardware must be on the Microsoft Cluster HCL). However, some situations that use an MNS cluster may have unique requirements (such as geographic clusters), where data must be replicated in real time between sites.

    > Does a cluster-aware application need to be rewritten to support MNS?
    No, using an MNS quorum requires no change to the application. However, some cluster-aware applications expect a shared disk (for example, SQL Server 2000), so while you do not need shared disks for the quorum, you do need shared disks for the application.

    > Does MNS get rid of the need for shared disks?
    It depends on the application. For example, clustered SQL Server 2000 requires a shared disk for data. Remember, MNS only removes the need for a shared disk quorum.

    > What does a failover cluster do in Windows Server 2008?
    A failover cluster is a group of independent computers that work together to increase the availability of applications and services. The clustered servers (called nodes) are connected by physical cables and by software. If one of the cluster nodes fails, another node begins to provide service (a process known as failover). Users experience a minimum of disruptions in service.

    > What new functionality does failover clustering provide in Windows Server 2008?
    New validation feature. With this feature, you can check that your system, storage, and network configuration is suitable for a cluster.
    Support for GUID partition table (GPT) disks in cluster storage. GPT disks can have partitions larger than two terabytes and have built-in redundancy in the way partition information is stored, unlike master boot record (MBR) disks.
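
    The two-terabyte figure for MBR disks follows from simple arithmetic, shown here as a Python sketch. It assumes the traditional 512-byte sector size; MBR partition entries store sector addresses and counts in 32-bit fields.

```python
SECTOR_BYTES = 512    # assumed traditional sector size
MBR_LBA_BITS = 32     # MBR partition entries use 32-bit sector fields

# Largest addressable size: 2^32 sectors of 512 bytes each.
mbr_limit_bytes = (2 ** MBR_LBA_BITS) * SECTOR_BYTES
mbr_limit_tib = mbr_limit_bytes / 2 ** 40

print(mbr_limit_bytes, mbr_limit_tib)
```

    GPT uses 64-bit sector addresses, which is why it is not bound by this limit.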


    > What happens to a running cluster if the quorum disk fails in a Windows Server 2003 cluster?
    In Windows Server 2003, the quorum disk resource is required for the cluster to function. If the quorum disk suddenly becomes unavailable to the cluster, then both nodes immediately fail and are not able to restart the Cluster service (clussvc).
    In that light, the quorum disk was a single point of failure in a Microsoft cluster implementation. However, it was usually a fairly quick workaround to get the cluster back up and operational. There are generally two solutions to that type of problem:
    1. Determine why the quorum disk failed and repair it.
    2. Reprovision a new LUN, present it to the cluster, assign it a drive letter, and format it. Then start one node with the /FQ switch and, through cluadmin, designate the new disk resource as the quorum. Then stop and restart clussvc normally, and bring the second node online.

    > What happens to a running cluster if the quorum disk fails in a Windows Server 2008 cluster?
    The cluster continues to work, but failover will not happen if any other failure then occurs on the active node.

    When the physical disks are not powering up or spinning, the Cluster service cannot initialize any quorum resources.
    Cause: Cables are not correctly connected, or the physical disks are not configured to spin when they receive power.
    Solution: After checking that the cables are correctly connected, check that the physical disks are configured to spin when they receive power.

    The Cluster service fails to start and generates an Event ID 1034 in the Event log after you replace a failed hard disk, or change drives for the quorum resource.
    Cause: If a hard disk is replaced, or the bus is re-enumerated, the Cluster service may not find the expected disk signatures, and consequently may fail to mount the disk.
    Solution: Write down the expected signature from the Description section of the Event ID 1034 error message. Then follow these steps:
    1. Back up the server cluster.
    2. Set the Cluster service to start manually on all nodes, and then turn off all but one node.
    3. If necessary, partition the new disk and assign a drive letter.
    4. Use the confdisk.exe tool (available in the Microsoft Windows Server 2003 Resource Kit) to write that signature to the disk.
    5. Start the Cluster service and bring the disk online.
    6. If necessary, restore the cluster configuration information.
    7. Turn on each node, one at a time.
    For information on replacing disks in a server cluster, see Knowledge Base article Q305793, "How to Replace a Disk with Windows 2000 or Windows Server 2003 family Clusters" in the Microsoft Knowledge Base.

    Drive on the shared storage bus is not recognized.

    Cause: Scanning for storage devices is not disabled on each controller on the shared storage bus.
    Solution: Verify that scanning for storage devices is disabled on each controller on the shared storage bus.
    Many times, the second computer you turn on does not recognize the shared storage bus during the BIOS scan if the first computer is running. This situation can manifest itself in a "Device not ready" error being generated by the controller or in substantial delays during startup.
    To correct this, disable the option to scan for devices on the shared controller.
    Note: This symptom can manifest itself as one of several errors, depending on the attached controller. It is normally accompanied by a one- to two-minute start delay and an error indicating the failure of some device.

    Configuration cannot be accessed through Disk Management.

    Under normal cluster operations, the node that owns a quorum resource locks the drive storing the quorum resource, preventing the other nodes from using the device. If you find that the cluster node that owns a quorum resource cannot access configuration information through Disk Management, the source of the problem and the solution might be one of the following:
    Cause: A device does not have physical connectivity and power.
    Solution: Reseat controller cards, reseat cables, and make sure the drive spins up when you start.
    Cause: You attached the cluster storage device to all nodes and started all the nodes before installing the Cluster service on any node.
    Solution: After you attach all servers to the cluster drives, you must install the Cluster service on one node before starting all the nodes. Attaching the drive to all the nodes before you have the cluster installed can corrupt the file system on the disk resources on the shared storage bus.

    SCSI or fiber channel storage devices do not respond.


    Cause: The SCSI bus is not properly terminated.
    Solution: Make sure that the SCSI bus is not terminated early and that the SCSI bus is terminated at both ends.
    Cause: The SCSI or fiber channel cable is longer than the specification allows.
    Solution: Make sure that the SCSI or fiber channel cable is not longer than the cable specification allows.
    Cause: The SCSI or fiber channel cable is damaged.
    Solution: Make sure that the SCSI or fiber channel cable is not damaged. (For example, check for bent pins and loose connectors on the cable and replace it if necessary.)

    Disk groups do not move or stay online pending after move.
    Cause: Cables are damaged or not properly installed.
    Solution: Check for bent pins on cables and make sure that all cables are firmly anchored to the chassis of the server and drive cabinet.

    Disks do not come online or Cluster service does not start when a node is turned off.
    Cause: If the quorum log is corrupted, the Cluster service cannot start.
    Solution: If you suspect the quorum resource is corrupted, see the information on the problem "Quorum log becomes corrupted" in Node-to-node connectivity problems.

    Drives do not fail over or come online.
    Cause: The drive is not on a shared storage bus.
    Solution: If drives on the shared storage bus do not fail over or come online, make sure the disk is on a shared storage bus, not on a nonsystem bus.
    Cause: If you have more than one local storage bus, some drives in Shared cluster disks will not be on a shared storage bus.
    Solution: If you do not remove these drives from Shared cluster disks, the drives do not fail over, even though you can configure them as resources. Shared cluster disks is in the Cluster Application Wizard.

    Mounted drives disappear, do not fail over, or do not come online.
    Cause: The clustered mounted drive was not configured correctly.
    Solution: Look at the Cluster service errors in the Event Log (ClusSvc under the Source column). You need to recreate or reconfigure the clustered mounted drive if the description of any Cluster service error is similar to the following:
    Cluster disk resource "disk resource": Mount point "mount drive" for target volume "target volume" is not acceptable for a clustered disk because reason. This mount point will not be maintained by the disk resource.
    When recreating or reconfiguring the mounted drive(s), follow these guidelines:
    - Make sure that you create unique mounted drives so that they do not conflict with existing local drives on any node in the cluster.
    - Do not create mounted drives between disks on the cluster storage device (cluster disks) and local disks.
    - Do not create a mounted drive from a clustered disk to the cluster disk that contains the quorum resource (the quorum disk). You can, however, create a mounted drive from the quorum disk to a clustered disk.
    - Mounted drives from one cluster disk to another must be in the same cluster resource group, and must be dependent on the root disk.

    Basic Troubleshooting Steps

    When working with SQL Server failover clustering, remember that the server cluster consists of a failover cluster instance that runs under Microsoft Cluster Services (MSCS). The instance of SQL Server might be hosted by Microsoft MSCS-based nodes that provide the Microsoft Server Cluster.

    If problems exist on the nodes that host the server cluster, those problems may manifest themselves as issues with your failover cluster instance. To investigate and resolve these issues, troubleshoot a SQL Server failover cluster in the following order:

    1. Hardware: Review Microsoft Windows system event logs.

    2. Operating system: Review Windows system and application event logs.

    3. Network: Review Windows system and application event logs. Verify the current configuration against the Knowledge Base article, Recommended Private "Heartbeat" Configuration on a Cluster Server.

    4. Security: Review Windows application and security event logs.

    5. MSCS: Review Windows system, application event, and cluster logs.

    6. SQL Server: Troubleshoot as normal after the hardware, operating system, network, security, and MSCS foundations are verified to be problem-free.

    Recovering from Failover Cluster Failure

    Usually, failover cluster failure is the result of one of two causes:

    - Hardware failure in one node of a two-node cluster. This hardware failure could be caused by a failure in the SCSI card or in the operating system.

    To recover from this failure, remove the failed node from the failover cluster using the SQL Server Setup program, address the hardware failure with the computer offline, bring the machine back up, and then add the repaired node back to the failover cluster instance.

    For more information, see How to: Create a New SQL Server Failover Cluster (Setup) and How to: Recover from Failover Cluster Failure in Scenario 1.

    - Operating system failure. In this case, the node is offline, but is not irretrievably broken.

    To recover from an operating system failure, recover the node and test failover. If the SQL Server instance does not fail over properly, you must use the SQL Server Setup program to remove SQL Server from the failover cluster, make necessary repairs, bring the computer back up, and then add the repaired node back to the failover cluster instance.

    Recovering from operating system failure this way can take time. If the operating system failure can be recovered easily, avoid using this technique.

    For more information, see How to: Create a New SQL Server Failover Cluster (Setup) and How to: Recover from Failover Cluster Failure in Scenario 2.

    Resolving Common Problems

    Problem: The Network Name is offline and you cannot connect to SQL Server using TCP/IP

    Issue 1: DNS is failing with cluster resource set to require DNS.

    Resolution 1: Correct the DNS problems.
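
    Before correcting DNS, it can help to confirm from a client machine whether the SQL Server network name resolves at all. The following is a minimal sketch, not part of the original troubleshooting steps. The function name resolve_name and the lookup parameter are illustrative assumptions; only socket.getaddrinfo comes from the Python standard library.

```python
import socket


def resolve_name(name, lookup=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses for a host name.

    Returns an empty list when the lookup fails. The `lookup` parameter is a
    hypothetical hook so the function can be exercised without a real DNS server.
    """
    try:
        infos = lookup(name, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

    Calling resolve_name with the network name of the failover cluster instance returns an empty list when the name cannot be resolved, which points at DNS registration rather than at SQL Server itself.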

    Issue 2: A duplicate name is on the network.

    Resolution 2: Use NBTSTAT to find the duplicate name and then correct the issue.

    Issue 3: SQL Server is not connecting using Named Pipes.

    Resolution 3: To connect using Named Pipes, create an alias using the SQL Server Configuration Manager to connect to the appropriate computer. For example, if you have a cluster with two nodes (Node A and Node B), and a failover cluster instance (Virtsql) with a default instance, you can connect to the server that has the Network Name resource offline using the following steps:

    1. Determine on which node the group containing the instance of SQL Server is running by using the Cluster Administrator. For this example, it is Node A.

    2. Start the SQL Server service on that computer using net start. For more information about using net start, see Starting SQL Server Manually.

    3. Start the SQL Server Configuration Manager on Node A. View the pipe name on which the server is listening. It should be similar to \\.\$$\VIRTSQL\pipe\sql\query.

    4. On the client computer, start the SQL Server Configuration Manager.

    5. Create an alias SQLTEST1 to connect through Named Pipes to this pipe name. To do this, enter Node A as the server name and edit the pipe name to be \\.\pipe\$$\VIRTSQL\sql\query.

    6. Connect to this instance using the alias SQLTEST1 as the server name.

    Problem: SQL Server Setup fails on a cluster with error 11001

    Issue: An orphan registry key in [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.X\Cluster]

    Resolution: Make sure the MSSQL.X registry hive is not currently in use, and then delete the cluster key.

    Problem: Cluster Setup Error: "The installer has insufficient privileges to access this directory: \Microsoft SQL Server. The installation cannot continue. Log on as an administrator or contact your system administrator"

    Issue: This error is caused by a SCSI shared drive that is not partitioned properly.

    Resolution: Re-create a single partition on the shared disk using the following steps:

    1. Delete the disk resource from the cluster.

    2. Delete all partitions on the disk.

    3. Verify in the disk properties that the disk is a basic disk.

    4. Create one partition on the shared disk, format the disk, and assign a drive letter to the disk.

    5. Add the disk to the cluster using Cluster Administrator (cluadmin).

    6. Run SQL Server Setup.
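
    The conditions in steps 3 and 4 can be written down as a small pre-Setup checklist. The sketch below is illustrative only: the SharedDisk and Partition classes and the setup_blockers function are hypothetical names, and the code checks a description you fill in by hand rather than querying Windows for the real disk layout.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Partition:
    formatted: bool
    drive_letter: str = ""  # empty string means no drive letter is assigned


@dataclass
class SharedDisk:
    is_basic: bool
    partitions: List[Partition] = field(default_factory=list)


def setup_blockers(disk: SharedDisk) -> List[str]:
    """List the ways a described disk differs from steps 3 and 4 above."""
    problems = []
    if not disk.is_basic:
        problems.append("disk is not a basic disk")
    if len(disk.partitions) != 1:
        problems.append(
            f"expected exactly 1 partition, found {len(disk.partitions)}"
        )
    for partition in disk.partitions:
        if not partition.formatted:
            problems.append("partition is not formatted")
        if not partition.drive_letter:
            problems.append("partition has no drive letter")
    return problems
```

    An empty result means the described disk matches the layout the resolution asks for; any other result names what to fix before adding the disk back to the cluster and running Setup.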

    Problem: Applications fail to enlist SQL Server resources in a distributed transaction

    Issue: Because the Microsoft Distributed Transaction Coordinator (MS DTC) is not completely configured in Windows, applications may fail to enlist SQL Server resources in a distributed transaction. This problem can affect linked servers, distributed queries, and remote stored procedures that use distributed transactions. For more information about how to configure MS DTC, see Before Installing Failover Clustering.

    Resolution: To prevent such problems, you must fully enable MS DTC services on the servers where SQL Server is installed and MS DTC is configured.
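
    Before changing the service, it may be useful to check which account MS DTC currently runs under. The sketch below parses the text that the Windows command sc qc MSDTC prints and compares the SERVICE_START_NAME value with NT AUTHORITY\NetworkService, the account this procedure configures. The function names are illustrative assumptions, and query_msdtc_config only works on a Windows host where sc is available.

```python
import subprocess

EXPECTED_ACCOUNT = r"NT AUTHORITY\NetworkService"


def service_account(sc_qc_output):
    """Extract the SERVICE_START_NAME value from `sc qc` output, or None."""
    for line in sc_qc_output.splitlines():
        # Split on the first colon only, so values such as C:\... stay intact.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "SERVICE_START_NAME":
            return value.strip()
    return None


def runs_as_expected(sc_qc_output, expected=EXPECTED_ACCOUNT):
    """True when the service logs on with the expected account (case-insensitive)."""
    account = service_account(sc_qc_output)
    return account is not None and account.lower() == expected.lower()


def query_msdtc_config():
    """Run `sc qc MSDTC` and return its text output (Windows only)."""
    result = subprocess.run(
        ["sc", "qc", "MSDTC"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

    On a cluster node, runs_as_expected(query_msdtc_config()) returning False suggests that the logon account still needs to be changed.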

    To fully enable MS DTC, use the following steps:

    1. In Control Panel, open Administrative Tools, and then open Computer Management.

    2. In the left pane of Computer Management, expand Services and Applications, and then click Services.

    3. In the right pane of Computer Management, right-click Distributed Transaction Coordinator, and select Properties.

    4. In the Distributed Transaction Coordinator window, click the General tab, and then click Stop to stop the service.

    5. In the Distributed Transaction Coordinator window, click the Logon tab, and set the logon account to NT AUTHORITY\NetworkService.

    6. Click Apply and OK to close the Distributed Transaction Coordinator window. Close the Computer

    Management window. Close the Administrative