Using Touch ID on Macbook Pro for sudo Authentication

Full credit goes to Cabel Sasser (@cabel) on this one for sharing the original tip. I’m simply sharing it here and showing the process to prove the awesomeness of this capability.

If you run a MacBook Pro with the Touch ID option, you have already discovered the speed at which you can authenticate for a number of GUI-driven products. Running sudo in the command line does not give you that luxury, usually.

By making a small change

First, you have to edit the /etc/pam.d/sudo file with your editor of choice. It’s a read only file and you need admin privileges to do so. Oh the irony!

I’m going to use sudo vim /etc/pam.d/sudo to open up the file. This prompts me for credentials in the terminal session, as it should:

Add the following to the first line in the file after the comment:

auth sufficient pam_tid.so

You can space it out for consistency with the other lines:

Save the file. It’s read-only, so I have to use w! to save, and then exit back to the shell and close your terminal.

Launch a new terminal session so that you have no cached sudo session credentials and try a new sudo command such as sudo vim /etc/hosts and watch the magic happen:

This should be a nice time saver for you, especially when you use complex passwords…like you should 🙂




Resetting vSphere 6.x ESXi Account Lockouts via SSH

VMware vSphere has had a good security feature added since vSphere ESXi 6.0 to add a root account lockout for safety. After a number of failed login attempts, the server will trigger a lockout. This is a good safety measure for when you have public facing servers and is even important for internally exposed servers on your corporate network. We can’t always assume that it’s external bad actors who are the only ones attempting to breach your devices.

Using the vSphere web client shows us the settings which are used to define the lockout count and duration. The parameters under the Advanced settings are as follows:

Security.AccountLockFailures
Security.AccountUnlockTime

Resetting your Failed Login Attempts with pam_tally2

There is a rather simple but effective tool to help you do this. It’s called pam_tally2 and is baked in with your ESXi installation. The command line to clear the lockout status and reset the count to zero for an account is shown here with the root account as an example:

pam_tally2 --user root --reset

In order to gain access to do this, you will need to have SSH access or console access to your server. Console access could be at a physical or virtual console. For SSH access, you need to use SSH keys to make sure that you won’t fall victim to the lockouts for administrative users. In fact, this should be a standard practice. Setting up the SSH keys is relatively simple and is nicely documented in the Knowledge Base article Allowing SSH access to ESXi/ESX hosts with public/private key authentication (1002866)

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1002866

Uploading a key can be done with the vifs command as shown here:

https://docs.vmware.com/en/VMware-vSphere/6.0/com.vmware.vsphere.security.doc/GUID-392ADDE9-FD3B-49A2-BF64-4ACBB60EB149.html

The real question will come as to why you have the interface exposed publicly. This is a deeper question that we have to make sure to ask ourselves at all times. It’s generally not recommended as you can imagine. Ensuring you always use complex passwords and 2-factor authentication is another layer which we will explore. Hopefully this quick tip to safely reset your accounts for login is a good first step.




Got Logs? Get a PaperTrail: First thoughts

I stumbled upon Papertrail through a Twitter Ad (hey, those things work sometimes!) and figured that I should take a quick look. Given the amount of work I’ve been doing around compliance management and deployment of distributed systems, this seems like it may be an interesting fit. Luckily, they have a free tier as well which means it’s easy to kick the tires on it before diving in with a paid commitment.

The concept seems fairly easy:

The signup process was pretty seamless. I went to the pricing page to see what the plan levels are which also has the Free Plan – Sign Up button nicely planted center of screen:

What I really like about this product is the potential to go by data ingestion rather than endpoints for licensing. Scalability is a concern with pricing for me, so knowing that the amount of aggregate data drives the price was rather comforting to me.

The free tier gets a first month with lots of data followed by a 100 MB per month follow on limit. That’s probably not too difficult to cap out at, so you can easily see that people will be drawn to the 7$ first paid tier which ups the data to 1GB of storage and 1 year of retention. Clearly, at 7 days retention for the free tier, this is meant to just give you a taste and leave you looking for more if the usability is working for you.

First Steps and the User Experience

On completion of the first form, there is a confirmation email. You are also logged in immediately and ready to roll with the simple welcome screen:

Clicking the button to get started brings you to the instruction screen complete with my favorite (read: most despised) method of deploying which is pushing a script into a sudo bash pipe.

There is an option to run each script component which is much more preferred so you can see the details of what is happening.

Once you’ve done the initial setup process, you get a quick response showing you have active events being logged:

Basic logging is one thing for the system, so the next logical step is to up the game a bit and add some application level logging which is done using the remote-rsyslog2 collector. The docs and process to deploy are available inside the Papertrail site as well:

Now that I’ve got both by system and an application (I’ve picked the Apache error log as a source location) working, I’m redirected to see the live results in my Events screen (mildly censored to protect the innocent):

You can highlight some specific events and drill down into the different context views by highlighting and clicking anywhere in the events screen:

Searching the logs is pretty simple with a search bar that uses simple structured search commands to look for content. Searches are able to be saved and stored for reporting and repetitive use.

On the first pass, this looks like a great product and is especially important for you to think about as you look at how to aggregate logs for the purpose of search and retention for security and auditing.

The key will be making sure that you clearly define the firewall and VPC rules to ensure you have access to the remote server at Papertrail and then to make sure that you keep track of the data you need to retain. I’ve literally spent 15 minutes in the app and that was from first click to live viewing of system and application logs. All that and it’s free too.

There is a referral link which you can use here if you want to try it out.

Give it a try if you’re keen and let me know your experiences or other potential products that are freely available that could do the same thing. It’s always good to share our learnings with the community!




The Need for IT Operations Agility: Lessons of WannaCry

There is little doubt that the news of ransomware like the recent outbreak of the WannaCry (aka Wcry, WannaCrypt) taking hold in critical infrastructure hits home with every IT professional. The list of affected clients of any ransomware or critical vulnerability is made even more frightening when it means the shutting down of services which could literally affect people’s health like the NHS is experiencing.

Would it be any different if it were a small hardware chain? What if it was a bank? What if it was your bank, and your money was now inaccessible because of it? The problem just became very real when you thought about that, didn’t it?

Know Your (Agile) Enemy

Organizations are struggling with the concept of more rapid delivery of services. We often hear that the greatest enemy of many products is status quo. It becomes even more challenging when we have bad actors who are successfully adopting practices to deliver faster and to iterate continuously. We aren’t talking Lorenzo Lamas and Jean Claude Van Damme kind of bad actors, but the kind who will lock down hospital IT infrastructure putting lives at risk in search of ransom.

While I’m writing this, the WannaCry ransomware has already evolved and morphed into something more resilient to the protections that we had thought could prevent it from spreading or taking hold in the first place. We don’t know who originally wrote the ransomware but we do know that in the time that we have been watching it that it has been getting stronger. As quickly as we thought we were fighting it off by reducing the attack surface,

The Risks of Moving Slowly

Larger organizations are often fighting the idea of risks of moving quickly with things like patching and version updates across their infrastructure. There are plenty of stories about an operating system patch or some server firmware that was implemented on the heels of its release to find out that it took down systems or impacted them negatively in one way or another. We don’t count or remember the hundred or thousands of patches that went well, but we sure do remember the ones that went wrong. Especially when they make the news.

This is where we face a conundrum. Many believe that having a conservative approach to deploying patches and updates is the safer way to go. Those folks view the risk of deploying an errant patch as the greater worry versus the risk of having a vulnerability exposed to a bad actor. We sometimes hear that because it’s in the confines of a private data center with a firewall at the ingress, that the attack surface is reduced. That’s like saying there are armor piercing bullets, but we just hope that nobody who comes after us has them.

Hope is not a strategy. That’s more than just a witty statement. That’s a fact.

Becoming and Agile IT Operations Team

Being agile on the IT operations side of things isn’t about daily standups. it’s about real agile practices including test-drive infrastructure and embracing platforms and practices that let us confidently adopt patches and software at a faster rate. A few key factors to think about include:

  • Version Control for your infrastructure environment
  • Snapshots, backups, and overall Business Continuity protections
  • Automation and orchestration for continuous configuration management
  • Automation and orchestration at all layers of the stack

There will be an onslaught of vendors using the WannaCry as part of their pitch to help drive the value of their protection products up. They are not wrong in leveraging this opportunity. The reality is that we have been riding the wave of using hope as a strategy. When it works, we feel comfortable. When it fails, there is nobody to blame except for those of us who have accepted moving slowly as an acceptable risk.

Having a snapshot, restore point, or some quickly accessible clone of a system will be a saving grace in the event of infection or data loss. There are practices needed to be wrapped around it. The tool is not the solution, but it enables us to create the methods to use the tool as a full solution.

Automation and orchestration are needed at every layer. Not just for putting infrastructure and applications out to begin with, but for continuous configuration management. There is no way that we can fight off vulnerabilities using practices that require human intervention throughout the remediation process. The more we automate, the more we can build recovery procedures and practices to enable clean rollbacks in the event of a bad patch as well as a bad actor.

Adapting IT Infrastructure to be Disposable

It’s my firm belief that we should have disposable infrastructure wherever possible. That also means we have to enable operations practices which mean we can lose portions of the infrastructure either by accident, incident, or on purpose, with minimal effect on the continuation of production services. These disposable IT assets (software and hardware) enable us to create a full stack, automated infrastructure, and to protect and provide resilience with a high level of safety.

We all hope that we won’t be on the wrong side of a vulnerability. Having experienced it myself, I changed the way that I approach every aspect of IT infrastructure. From the hardware to the application layers, we have the ability to protect against such vulnerabilities. Small changes can have big effects. Now is always the time to adapt to prepare for it. Don’t be caught out when we know what the risks are.