Unfortunately, I haven’t had the opportunity to talk racing very much on the blog here, but hopefully that is about to change. My Formula First (in the image) had its engine delivered to Autowerks in New Jersey for rebuilding before the 2017 season.
More excitingly, my 2017 season begins at the Indianapolis Motor Speedway (yes, I bolded it – it’s just that exciting) with the Open Wheel World Challenge. When I get my engine back later this spring I look forward to blogging about its re-installation in the car and our preparations for an event that will be, at minimum, something 9-year-old me would have never believed. I’ll be racing at Indy this summer! To be continued…
Thanks to Andy Yun for hosting this month’s T-SQL Tuesday #84: Growing New Speakers. This is a topic near and dear to my heart as I made my first presentation at SQL Saturday 531 in Louisville, KY in August. Since I’m a rookie, I’m not going to pretend that I have any brilliant tips for new speakers beyond the ones that I am sure are being shared throughout the community today. You should definitely practice your demos, you should definitely practice your speaking, and you should definitely go watch other experienced speakers to see how they approach a session, how they handle the room, and how they manage questions. However, none of that matters if you don’t commit to speaking and presenting to the community and that is where I want to focus this post.
Speaking in front of a group of people is not something that comes naturally to me. I am, as some in the community are, a bit of an introvert. I don’t come charging out of bed in the morning fired up to speak to a large room full of people. That said, I am fortunate enough to work at a place where I am encouraged to share my knowledge with the community. While that encouragement is important, my first session taught me two important things.
The first important thing that I learned is that any real world experience you have is useful to the community. While you may think something you are doing is boring or old hat to everybody, it’s not. Everybody is in a different place in their career journey and could be receptive to the information you have to share.
The second important thing I learned is that the community of SQL Server data professionals is full of wonderful, supportive people. Even if it’s not your natural inclination, volunteer to speak. Beyond that, if possible, go to the speaker’s dinner and introduce yourself to people. You may glean some knowledge, you may make some important networking connections, and you may even make some friends.
As I said, some or all of this may not come naturally to every single one of us. In my experience, it’s worth forcing yourself through that wall and out of your comfort zone and presenting to the community. In fact, my experience was so positive I volunteered to do it again and will be presenting two sessions at SQL Saturday 552 in Lincoln, NE on Saturday, November 19. Come see me and introduce yourself!
When the topic for this month’s T-SQL Tuesday was announced here, it really got the wheels turning in my mind. So, in the spirit of this month’s topic, here is my spin on the official topic: “In the 10+ years I have been a database professional, we’re still dealing with a lack of communication between DBAs and developers as well as DBAs and operations staff”. In other words, can’t we all just get along?
Let us go ahead and separate this into two sub-topics: DBAs not talking to developers (and vice-versa) as well as DBAs not talking to operations staff (and vice-versa). While the consequences of each communication gap may be different, in my mind the solution is the same in both cases.
First, both when I was an operations DBA and now that I am a data platform consultant, I would consistently talk to DBAs who say “I don’t need to understand the applications running on my server(s), I just need to keep the lights on.” While that may technically be true by your job description, that will cause two problems, one for that type of DBA and one for the team with which they’re interacting. The problem for the team is that any performance and/or design guidance the team requests from that DBA will be, at best, misguided or, at worst, destructive. Blindly offering generic database advise or approving index or code changes from the team without the DBA(s) having a thorough understanding of the impacts truly does a disservice to the team being supported and will likely eventually lead to the team turning to Google/Bing for “expert” advice rather than their internal DBA staff. The problem for the DBA can be much more significant – if all you’re doing is keeping the lights on you are A) replaceable as you do not have much specialized knowledge and B) replaceable because your job may be automated or outsourced.
Second, I still talk to some database professionals who proudly declare their agnosticism regarding the servers their databases reside on, the network rails their data rides, and the storage their data uses. While I understand this decision, especially given the always-increasing amount of things “SQL Server people” are supposed to know, my experience in a lot of middle of the night phone calls over the years is that “I don’t know” is not a popular answer when it’s 3 AM, there’s something wrong with “the database”, and the company’s creeping ever closer to violating an SLA if the issue isn’t resolved ASAP.
I do understand our frustration, as a data professional, when a non-technical person accuses “the database” of being broken when what’s actually down is the SAN, the network, the battery backup on the UPS during a power outage, etc. That said, this is where having some basic knowledge of the system you support can go a long ways. If your answer to the manager called into the fire alarm call and demanding a response is “I don’t know” that will not reflect well on you. If your answer is “there doesn’t appear to be a SQL Server issue but let’s look at <storage, network, etc.> because I see X in my monitoring tools” that will not only bring about a faster resolution of the issue, most likely, but it will show those up the chain from you that you are a professional who thoroughly understands the environment you’re supporting.
The solution to all these issues is simply this: communicate. In a very real sense, the developers, the DBAs, and the network/storage/operations folks are all in this together because if the application you’re coding/refining/supporting runs into significant issues it can put everybody’s career at risk. Even if management doesn’t realize this and doesn’t facilitate the communication, make it happen for yourself and your colleagues. Break bread with them, have a beer, setup inter-departmental lunch and learns, or just ask them what’s new in their field or if they got any shiny new toys at work. I’ve done all of these things and seen them all work. Even if all else fails, maybe try this after receiving appropriate responses from all parties:
Last Saturday (8/6/16) I had the pleasure of doing my first community presentation at SQL Saturday 531 in Louisville, KY. During a great speakers’ dinner Friday night at Brick House Tavern + Tap I had the opportunity to meet a few members of the #sqlfamily and get some solid advice for my rookie presentation the next day.
My presentation the next day went fairly smoothly despite a visit from the demo demons (courtesy of what seemed to be a flaky system hooked up to the projector) that deprived my session of some of the demos I intended to present. I genuinely appreciated each and every person that attended the session and I hope the Always On tips and tricks I shared are or will be a help to their organizations.
In short, if you’re not attending SQL Saturdays, why not? It’s tons of free training from experienced people and you’ll meet good people as well. And, if you’ve attended a few SQL Saturdays (as I have) and have tips and information to share, why not try your hand at presenting? It was a rewarding, invigorating experience and I can’t wait to present again in the near future.
If I still haven’t convinced you to involve yourself in SQL Saturdays, check out Chris Yates’ blog for a great breakdown of why SQL Saturdays can be one of the best investments you can make in your career.
I’ve seen this particular Always On issue occur at two different customers. Both would occasionally have issues where one (or more) of the replicas would fail to restart their communication within the availability group. The error message was usually something like this: The Database Mirroring endpoint cannot listen on port 5022 because it is in use by another process.
In a large, complex corporate environment, the DBAs may not always have permission to run the netstat commands necessary to figure out what application is “stealing” the port. Even if they’re able to figure out which application is causing the port conflict, it may take a fair amount of time to change that application’s behavior.
Since a communication failure within an AG is usually a “hair on fire” kind of event, a quick fix may be desired. The quickest fix I’ve found is to change the port on which the AG endpoint is listening. While the Microsoft help pages on this contain enough information to put together the right script eventually, the first time this happened to me I really would have liked a blog post specific to this issue that gave me the right script to use.
In keeping with this T-SQL Tuesday’s theme, below is my SQL Server present. Please note that I used 5023 as an example but your choice can be flexible depending on what else is consuming ports on your machine. Hopefully this helps somebody (or me if I manage to travel back in time and encounter this same issue):
ALTER ENDPOINT [<endpoint name>]
STATE = STARTED
AS TCP (LISTENER_PORT = 5023)
ALTER AVAILABILITY GROUP [<AG name>]
MODIFY REPLICA ON ‘<replica name>’ WITH (ENDPOINT_URL = ‘<server name>:5023’)
If this post helped you and you live in the Louisville, KY area, please consider registering for SQL Saturday 531 on August 6, 2016 and attend my Always On session there – you’re guaranteed to get what you pay for since SQL Saturdays are free!
Thanks for reading! This weekend I’m competing in rounds 3 and 4 of the The Hoosier Tire US Formula First Championship Series at Mid-Ohio Sports Car Course so I’ll be interrupting the SQL Server stuff usually posted here with some news about my first event of the 2016 racing season.
I’m honored to have been selected to speak at SQL Saturday #531 in Louisville, KY on August 6. I’ll be presenting the session that accompanies the first blog post on this site (http://bit.ly/290Oamz), “Top 5 Tips to Keep Always On Always Humming and Users Happy”.
If you work with AlwaysOn Availability Groups in a production environment long enough, you will inevitably encounter a situation that forces you to failover your AG. If you’ve just started working with AGs, that failover is likely followed by a developer or application administrator calling you in the middle of the night cheerfully telling you “something is not right with the database”.
I know this because it happened to me in my early days of supporting AlwaysOn AGs in production, and the reason was never because the database itself was not alive and available. Thankfully, while AlwaysOn is absolutely a marketing term, it’s also an accurate description of a properly configured environment. Fortunately, Microsoft provides us a lot of helpful information covering basic setup and configuration of an AlwaysOn AG. That information can be found starting here.
What isn’t centrally provided and what would have prevented a few late night phone calls to my phone over the years was a list of tips to configure the other things an application needs from an AlwaysOn AG configuration to smoothly operate after a failover.
Tip #1 Quorum Votes
While this is more of a server administrator than DBA tip, since it’s likely the DBA’s phone that will ring if this is configured incorrectly, we need to take an interest in what’s happening across our operations center. Correctly establishing server quorum votes, in most environments, is the responsibility of a server administrator. Rest assured, though, when the on-call phone rings at 4 am you’re the one waking up if this is not configured correctly.
While a deep dive into this topic is a blog in and of itself, the short version of the tip is this: if your AG spans datacenters, make sure that each datacenter has an even number of quorum votes in the cluster and that the cluster has a file share witness configured at a third, separate site. Basically, a typical Windows cluster needs a majority of voting cluster nodes online to maintain normal cluster operations. If the votes are evenly spread between two datacenters and you have a witness at a third site breaking the tie, losing just one datacenter will not bring your cluster down. While you can force start a cluster after it loses quorum, that is going to create extra, manual work for you both you and your server administrators and that work will be done under the utmost pressure because your database will be inaccessible to the application via the AG listener even though the servers themselves may be healthy. Trust me when I say it’s better to learn this lesson reading a blog than being woken up in the middle of the night with somebody yelling “we’re down!”
Tip #2 Read Only Routing Lists
An important, but often overlooked tip is to configure a read-only routing list for your AG listener. I typically do this as part of the initial build-out of the AG because, without this, you lose one of the coolest features of AlwaysOn availability groups: the ability to route read-only connections to replicas. That feature can free up the primary to provide much more stable performance to the application(s) it serves. It also greatly decreases the possibility that your phone rings with somebody complaining about read-only queries slowing down (or worse yet, blocking) read/write queries.
Tip #3 Copy User Logins
This tip continues our trip up the stack and (hopefully) solves a problem that is easy to solve but can be very difficult to figure out under pressure. Many of the applications I’ve worked on over the years had a variety of database logins (using both Windows authentication and SQL Server authentication). While this is not ideal, there are often technical and/or business reasons why there are multiple logins (using both types of authentication) that need access to the database. In an AlwaysOn AG, particularly one where you have added replicas after the initial establishment of the AG, the replica addition may not have included setting up all the logins the application needs. In that scenario the failover will appear to have worked fine and led to happy application admins and DBAs…until components of the application begin malfunctioning because they don’t have access to the replica (now primary) database. Trying to fix this under a lot of pressure may lead you on a wild goose chase or three (as the application errors or complaining users may not always point you in the right direction), but if you follow the steps found here it will walk you through the process of synchronizing logins between two servers. This process should be one of the first things you do when setting up a new AG replica instance unless you really and truly enjoy off-hours phone calls about broken applications!
Tip #4 Regular Transaction Log Backups
The last two tips are more focused on keeping your AG happy, healthy, and running smoothly which should help keep you, the DBA, happy and healthy as well. Tip #4 is a general SQL Server maintenance best practice but can take on a bit of added importance when supporting AlwaysOn availability groups. Make sure you have a transaction log backup strategy in place and being actively monitored on all databases, but especially the databases participating in the AG. While the discussion of whether or not you should perform these backups on your primary or replica(s) (AlwaysOn gives you the option to choose the location of these backups) is an in-depth topic best left for another blog entry, regular transaction log backups help make maintenance and recovery of an AG-participating database much easier. While shrinking a transaction log is rarely, if ever, recommended, it’s a much more complicated operation in an AlwaysOn AG. Beyond that, a transaction log whose size is out of control can make database recovery a much longer process simply because of the size of the files you are moving/restoring – which means you are spending many more hours with your phone ringing off the hook with unhappy users. That is something that we’d all like to avoid.
Tip #5 Stagger Index Maintenance
My final tip, #5, should keep your network administrators off your back and your AGs as synchronized as possible. It’s also a fairly simple thing to do. If you have a large database within your AG, make sure you stagger the index maintenance that is performed during your regular index maintenance. Whether you rely on a maintenance plan within SQL Server, homebuilt index maintenance scripts, or one of the many solid index maintenance solutions available within the SQL Server community, ensure that the transaction log traffic your index maintenance generates (which can be surprisingly large) is not causing your remote replicas to fall behind your primary database. If your index maintenance is causing those remote replicas to fall behind by seconds or, worse yet, minutes, simply stagger your index maintenance jobs throughout the off-hours or at least make sure the index maintenance on your large tables is spread throughout the time allotted to you for index maintenance. This means that not only are the users who are using the replicas for ad-hoc querying and/or reports likely receiving up-to-the-second data but, in the event of a failover, your exposure to data loss is as minimal as your network and server infrastructure will allow.
I hope these five tips do indeed help keep your AlwaysOn availability groups, application administrators, and users happy. That should mean some happier DBAs as well! I look forward to taking deeper dives into a couple of these topics here on the blog, but if you enjoyed this post I encourage you to sign up for my December 1 webinar here. It will go deeper into some of these topics and I’ll be able to take questions as well.