posts

Social Semantic Web

2022.09.19

I maintain my concern that the web has gotten worse due to closed social media platforms, so I have been thinking a lot lately about decentralized models for social networks - as well as existing open standards that can help to close some of the gaps. In contrast to the community-building I’m most interested in, here are two ongoing culture wars that are on my mind.

One is the battle of content creators - particularly those authors of adult content. These folks keep being squeezed out of popular platforms, while their work is copied & exploited by celebrities. Tumblr used to be the primary home of weird fandoms, but a few years back it removed all adult content in an effort to appease Apple with its PG-13 app store rules. Instagram has become one of the more prominent battlegrounds since then, eliminating accounts with reckless abandon. A friend was just complaining about having to open their eighth account after the previous seven have been systematically removed. These creators are being kicked off even when following the platform’s rules, due to overly-aggressive moderation policies. For these folks, data portability would be a massive improvement, and owning their own websites from which they can share content is increasingly-critical to maintaining a fanbase. Many folks are using linktr.ee as a sort of mini-homepage to get around some of these limitations - but still not setting up their own personal websites.

The other more sinister war is that of white supremacy + domestic terrorism in America and abroad, where disinformation runs rampant on sites like Facebook and Twitter. Hateful content grows like mushrooms on shit in dark corners such as the *chans, but memes and lies are propagated back to the more mainstream platforms. Decentralization won’t fight the spread of hate in dark corners - and may even exacerbate the growth of the number of corners - but it can potentially fight the recommendation algorithms on mainstream sites showing disinformation to “normal” users.

In approaching these issues, I’ve been thinking about three pieces in particular: Content Aggregation, Discovery, and Collaboration.

Content Aggregation

It’s clear that folks don’t want to visit dozens of separate websites to consume content if they can avoid it, which is how we got where we are today. Content aggregation through feeds via Really Simple Syndication (RSS) (or ATOM, of course) provides a middle ground: authors can maintain control of their content from their own blogs & domains, but readers can consume content from a variety of sources in a single place. Google Reader was just about perfect as a feed reader, before Google killed it. It was simple and clean, and it even allowed groups to annotate content together! Over the last year I tried out Feedbin and Feedly, but neither impressed me. Lately I’ve been trying out Inoreader but I will admit that I find the design overwhelming - it feels like it’s trying too hard to be a social media site. Also the price for a “team” is way too high - which makes it a hurdle for collaboration. If there are other options that you like and I should consider, please drop me a note! Most modern feed readers (aka RSS readers, though they typically support more than just RSS) use the OPML standard for importing & exporting lists of feeds that you follow. This makes it much easier than before to switch between them - again, adding to the portability. Several of them also offer fake email inboxes for newsletters. As much as people are switching to Substack and similar platforms, I can’t say I’m a fan of reading email. As a content consumption experience it always feels… invasive, and the formatting is always poor.
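To make the portability point concrete, here is a minimal sketch in Python of pulling the feed list out of an OPML export. The sample document and the function name `feeds_from_opml` are made up for illustration; real exports vary, but feed entries are conventionally `outline` elements carrying an `xmlUrl` attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample export; a real file from a feed reader will be longer.
SAMPLE_OPML = """<opml version="2.0">
  <head><title>My feeds</title></head>
  <body>
    <outline text="Friends">
      <outline type="rss" text="Example Blog" xmlUrl="https://example.com/feed.xml"/>
    </outline>
    <outline type="rss" text="Another Blog" xmlUrl="https://example.org/atom.xml"/>
  </body>
</opml>"""


def feeds_from_opml(opml_text):
    """Return (title, feed URL) pairs for every outline that has an xmlUrl."""
    root = ET.fromstring(opml_text)
    feeds = []
    # iter() also walks nested outlines, so folders/categories are handled.
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:
            feeds.append((outline.get("text", ""), url))
    return feeds
```

Calling `feeds_from_opml(SAMPLE_OPML)` yields the two feeds and skips the "Friends" folder, since a folder has no `xmlUrl` of its own.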

Discovery

Finding your current friends on various sites continues to be a painful process. Finding new content and folks to follow also tends to be difficult. These days, most of the new folks I find are the result of friends sharing their posts on Twitter. Most folks only talk about their own work on their blogs, but including references to other folks’ content greatly aids discovery. I’ve added a “microblog” of recommended content to my website here - again taking a nod from 90s websites - thinking it could help folks find new ideas and creators.

I remembered that Livejournal supported the Friend-of-a-Friend interchange format (FOAF), and allowed you to export a list of the folks that you followed in a single XML file. Such a file could easily be automated on modern self-run blogging platforms like Jekyll, Hugo, Wordpress, etc. The same source content could generate a “links” page like we had on websites in the 90s. I’m adding this to my list of things to tinker with on this Jekyll site. A smart feed reader could even look for a FOAF file and easily help you find your friends’ blogs, as an alternative to OPML. That could then help you find your friends-of-friends, to suggest additional content to you that may be relevant - for instance, blogs that are followed by at least X% of the people you follow, or that follow you.
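The "followed by at least X% of the people you follow" idea could be sketched roughly like this in Python. It assumes the follow lists have already been collected (say, from each friend's FOAF file) into plain sets of URLs; the function name, the data shapes, and the 50% default are all hypothetical.

```python
from collections import Counter


def suggest_blogs(my_follows, follows_by_friend, threshold=0.5):
    """Suggest blogs followed by at least `threshold` of the people I follow.

    my_follows: set of blog URLs I already follow.
    follows_by_friend: dict mapping a followed blog to the set of blogs
        its author follows (e.g. read from their FOAF file).
    """
    # Only friends whose follow list we could actually read are counted.
    friends = [f for f in my_follows if f in follows_by_friend]
    if not friends:
        return []
    counts = Counter()
    for friend in friends:
        # Ignore a friend listing their own blog.
        counts.update(follows_by_friend[friend] - {friend})
    suggestions = [
        blog for blog, n in counts.items()
        if blog not in my_follows and n / len(friends) >= threshold
    ]
    # Most widely followed first; ties broken alphabetically for stable output.
    return sorted(suggestions, key=lambda b: (-counts[b], b))
```

One design choice worth noting: the percentage is taken over friends who publish a follow list, not over everyone you follow, so a handful of friends with FOAF files can still produce suggestions.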

Collaboration

Collaboration presents a large series of challenges in a decentralized world. For those of us who run static personal websites, it’s hard to see what content is referencing yours. It’s even harder to receive comments in a public way that’s coherent to your site - replying to a blog post historically requires an account on whatever site you’re on, which doesn’t really work for static sites built with tools like Jekyll. WordPress, due to its widespread use and dynamic nature, has been rather successful in this area. The pingback mechanism allows even self-hosted sites to receive a notification if another site mentions them. As I understand it, this works by the linking site sending a notification request directly to the site it mentions. Some folks have implemented the Disqus comment platform on static sites, but Disqus has had some serious security issues in the past. That also doesn’t give an easy way to cross-collaborate - the content is still bound to your website, through a centrally-managed provider. My research on FOAF led me to a number of old scholarly articles on the Social Semantic Web and specifically the Semantically-Interlinked Online Communities (SIOC) format. This format was a way of defining and linking content across multiple sites, which would also allow portability of content. You could write a post on one site and have it federated to other message boards for replies and interaction. SIOC seemed to have been a popular idea around 2004-2008, but then appears to have died off completely. It seems to have been another of the interesting/weird/way-too-complicated RDF-based projects that arose during the short period when the open standards community got obsessed with Linked Data. My initial impression is that it’s an interesting model, but too difficult for laypeople to use. Sadly, though there were tools built to natively work with SIOC on WordPress, Drupal, and other popular blogging engines, all have disappeared today - most lost due to linkrot.
Relatedly, I remain skeptical of the web annotation movement from the same era (e.g. sites like Genius, née RapGenius) due to my work on that W3C committee. The potential for abuse and harassment is simply too great and not taken seriously by the community. As such, I’ve largely rejected the concept, but with some healthy and robust standards-based controls (FOAF? robots.txt?) it could potentially have a place in an RSS reader. (Assuming it would default to opt-in not opt-out!) For the moment, I think the simplest solution might be the best. Most web developers are familiar with the <link> tag and its rel attribute, which allows you to define relationships between web pages. Most commonly, we use these for CSS stylesheets, and links to our own RSS feeds on our pages. A less commonly known attribute is rev, or reverse; basically it’s the opposite of rel. One could provide a <link rev="child" href="https://thatsite/over/there/"> in a page’s head, as a declaration that the current page is a reply (a “child”) of the thatsite page referenced. I’m actually testing that out on the page you’re currently reading! However, nothing supports this today, so it doesn’t do anything. Again, a clever RSS reader could pull this metadata from an <entry> and use it to assemble a content tree - or even notify the original author if they provided the proper metadata in their feed. A site owner could also use their own FOAF list or OPML to scrape for entries that are replies from friends and list them on a given page. This makes for an opt-in model which leaves control in the hands of the original creator. I’ll consider this for a later Jekyll plugin project if this concept gains traction.
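As a rough sketch of what such a reader (or a site owner's scraper) might do, the Python below scans already-fetched pages for <link rev="child"> declarations and groups replies under the page they point at. Nothing here is an existing tool: the class and function names are hypothetical, and it reads the page head rather than a feed <entry>, purely to keep the example small.

```python
from html.parser import HTMLParser


class ReplyLinkParser(HTMLParser):
    """Collect href values of <link rev="child"> tags found in a page."""

    def __init__(self):
        super().__init__()
        self.parents = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rev may hold several space-separated values.
        revs = (attr.get("rev") or "").lower().split()
        if "child" in revs and attr.get("href"):
            self.parents.append(attr["href"])


def build_reply_tree(pages):
    """Map each parent URL to the pages that declare themselves a reply to it.

    pages: dict of page URL -> HTML text, already fetched by the reader.
    """
    tree = {}
    for url, html in pages.items():
        parser = ReplyLinkParser()
        parser.feed(html)
        for parent in parser.parents:
            tree.setdefault(parent, []).append(url)
    return tree
```

Because the reply is declared on the replying page and only listed if the original author chooses to look for it, this keeps the opt-in shape described above.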

Momentum

Which brings us to the root problem - momentum. FOAF hasn’t gained traction. SIOC died on the vine. All of these more complicated methods didn’t gain mainstream support because they inherently go against the current capitalist model of capturing an audience on one site for increasingly long periods of time. And of course, they’re too complicated for the average person to pick up and use the way they can with just plain old HTML. That being said, my interest here remains in digital communities - and I have some half-baked suspicions that communities have an upper bound on how much they can scale sustainably. So maybe you don’t need enough mass-market appeal for a billion-dollar company, just a simple collection of tools that your community can support. I’ve started with my little civic tech webring as one example of what can be done at a smaller scale, and I’m starting to think about how a webring could become a “group” (FOAF or otherwise). As always, if these topics are interesting to you, please drop me a line. Or, maybe create a blog post about the topic yourself … and let me know about it!


Making the Web Weirder

2022.09.03

The Social Media Plague

Over the last few years, I’ve been struggling with social media. Growing up in a sleepy college town it was hard to find other weirdos, and the internet provided a new and interesting way to do so. While Usenet, IRC, AOL, MySpace, LiveJournal provided increasingly flexible options to communicate, the thing that brought folks together was a passion for sharing things they love. People were intentional in carving out spaces for communities, and found new things and ideas they could love. (By the way, Katie West has put together a fantastic anthology of stories about finding communities on the internet.) Advertising has been around from almost the beginning, but social media changed things - instead of advertising showing up alongside the content, the content itself began being driven by the need for advertising. Today, we have sites like Facebook, Instagram, Tiktok, and Twitter actively profiting off of disinformation. Years of user research has enabled these companies to deliver a dopamine hit straight to users for rage-clicking on posts. (If this is sounding like paranoid fantasy, note that there’s plenty of thorough research and documentation on the topic.) Sites are designed to drive folks into these walled gardens, to get them to refresh the page a thousand times a day, every spare minute waiting in line spent staring at phones waiting for the next crumb of content. The COVID-19 pandemic physically isolated most people, and this only amplified the drive to use social media as an escape. And it’s making people miserable. It certainly makes me miserable. And when I’m feeling particularly curmudgeonly, I tend to say that it’s destroying democracy.

What Comes Next

During the pandemic, following a long period of work-induced burnout, I’ve spent the last year attempting to shift my energy towards productive efforts & creative endeavors. Spending time to share knowledge about my practice & craft. Making weird shit. To steal a phrase from Marie Kondo, I’ve been working on things that spark joy - at least in myself, but hopefully in others as well. One of those efforts was the Move Carefully and Fix Things stickers, though ironically those started from a long (and very divisive) Twitter thread. Another was the DigitalPolicy.us website. I put up a Minecraft server to play with friends (come join us!). And most recently, I’ve been making very silly t-shirt designs for government IT policies: [image: FISMA]

I’ve also been reading more longform writing pieces from smart folks. Not everything needs to be a five-page article, but I’m sure glad that people are putting them out there. But moreover, the discourse emerging from those pieces has been fascinating and insightful. And so, I had a realization: this is how I want to dedicate my time (outside of work, for now), finding more productive ways of building communities around things that we love and care about. At least for a while. The first step was in redesigning this website (more on that in a bit). In the coming weeks and months, I plan to explore what I’m thinking of as Web 1.1: modern takes on Blogs, RSS feeds, Webrings, and other basic means of content creation and sharing.

An Overly-Detailed Breakdown of My Site Redesign

I had a few things in mind when redoing this site. I knew, thematically, that I wanted it to be a throwback to the late 90s era of design, while keeping modern aesthetics. One aspect that was more common back then than today is sharing content off-site, as the inclination nowadays is to get people to your site and keep them there. I wanted to be able to share the neat things I was finding on the web, so I made dedicated space to talk about other folks’ work without needing to overly-editorialize about it. Similarly, I find a lot of great government job postings, and I want folks to know about them because we need more amazing people in government. I used Jekyll’s data files to power these pieces, and integrated them into my RSS feed. I went back to old classic web designs that incorporated mixed content effectively, and started plucking elements I found interesting like blocking and textures. K10K was a major source of inspiration, in addition to my own older website designs. I reverted to Silkscreen and Verdana fonts - which I’d used on my site 15 years before - while retaining the more modern Montserrat for headings (which is a decent free clone of the very-expensive Gotham font from HFJ made famous by the Obama Administration). With CSS flexbox and grid layout methods being supported in most modern browsers, I took a long look at whether I still needed Bootstrap. Aside from layout blocking, the main thing I was using there was the navigation fallback for mobile browsers & small screens. I’d long since abandoned the multi-tier menus, and condensing down the text gave me more than enough space for a mobile browser. So, out went Bootstrap. For the moment, I’ve kept FontAwesome for icons, though eventually that will probably go as well, to be replaced with SVGs. Of course, it wouldn’t be a 90s-themed website if I didn’t have a music player, so of course I had to use MIDI files. 
However, I quickly learned a few depressing facts: 1) modern web browsers do not support MIDI files and 2) no one just keeps webpages of free MIDIs anymore. I had to go digging through the depths of the internet to find a few suitable tracks. I then found the web-midi-player package, itself built on top of timidity. In retrospect, this was a mistake, as I also have to load patch files for every instrument the player will use; in the end, the downloads here are larger than if I’d just used MP3s. Furthermore, the audio driver used is not supported by most mobile browsers, so folks on their phones cannot enjoy my fine musical selections. At some point in the future, I’ll replace these with recorded MP3s of the MIDI files. Having a MIDI player won’t be very effective if the song stops playing when someone changes pages, so I considered using a pop-out player, but that seemed like it could be less-fun and more-annoying. This seemed like the perfect opportunity to use unpoly – a little JavaScript addition that turns static websites into single-page apps by loading just particular chunks of pages into the current page. (If you work with Rails, hotwire works similarly.) This way, I can swap out the main content area and leave the rest of the page alone - just like the early days of “DHTML.” One gotcha here is that for unpoly to work with your browser history (the dreaded back button problem), you have to manually set a configuration telling it where to load content, using a line of JavaScript after unpoly loads. up.history.config.restoreTargets=[':main']; does the trick here. I’m surprised this isn’t the default. Finally, I sprinkled in a few easter eggs that only the most dedicated spelunkers would find. Of course, it’s not the web if it’s not weird! I do need to go back and spend some time cleaning up a few accessibility issues. I’d also like to add a classic-style links page in the future, but few folks are keeping their websites active these days. Maybe if this movement grows that will change!
At the moment, I’m finishing work on my new t-shirt store, but following that I’ll be spending some time on a little RSS reader I’ve been tinkering with. Hopefully some of you folks will get in touch about collaborating on some of these upcoming projects - I’m looking forward to hearing from you all!


Federal Budget Challenges

2022.03.15

1. Annual Appropriations

One of the main reasons government IT is bad is that it’s chronically underfunded and also not funded in the right way. Generally money is only appropriated for an agency to use within a single year. They have to spend it in that time or it goes back to Treasury. This is why agencies buy printers, hardware, etc. at the end of the year, to show they used it all. Obviously, most IT projects can’t be completed in a single year. This means that there is a significant risk to starting a major IT improvement or system overhaul if the money might evaporate next year. (Or if Congress decides to delay on passing the appropriations bill for six months, preventing any additional money from being spent!) In addition to the TMF, the MGT Act created IT Working Capital Funds for the 24 CFO Act agencies (if they didn’t have one), which cost savings from IT projects can be funneled into, giving them a 3-year window to use saved money. This is capped at only a couple million dollars for each agency, usually about 3% of total salaries and expenses. (If I recall correctly, only the Department of Labor can sweep unused funding at the end of the year into their IT Working Capital Fund though. Also, of course, non-CFO Act agencies - all of the “smalls” - don’t usually have an IT WCF at all!) Needless to say, this makes major improvements almost impossible. GAO has a series of reports on the needed improvements - but it barely scratches the surface of the problem, highlighting only 10 systems out of THOUSANDS. And it doesn’t even cover the most important ones!

2. Competing Spending Priorities

The three big categories of spending in a government agency typically are: A. Rent; B. Salaries; C. Technology. After the first two get paid, IT finally gets a cut at what’s left. Note that the current “return to work” is a push to fill empty offices to justify constantly-increasing Rent. Salaries haven’t kept up with inflation, but still aren’t going down. That leaves only IT to keep taking a cut because some people want full offices to justify renting the space, instead of downsizing and maximizing remote/telework. (VA & GSA are big exceptions, which have embraced remote and closed offices!) To be fair, there have been some efforts to get IT higher up the priority list for agencies, notably FITARA. However, FITARA doesn’t force agencies to actually modernize, and moreover FITARA only applies to the CFO Act agencies (notice a pattern yet?). The Budget side of The Office of Management and Budget (OMB) is somewhat isolated from the Management side (& IT policy), so priorities don’t always translate. E.g., the main methodology for tracking IT spending & investment is the IT Capital Planning and Investment Control (CPIC) process, which is mostly not connected to the Federal Budget process. Of course, priorities even within the Management side aren’t necessarily coordinated. Although CPIC has risk and performance management elements, it mostly isn’t attached to any of the other ongoing priorities; just look at the latest Executive Orders on Cybersecurity or Customer Experience - no mention of CPIC, and little-to-no funding for these mandates. Even though it seems obvious that work to improve security or satisfaction with services should be tied to funding measures, there are instead lots of silos and political territories sliced up.

3. Disconnected Processes

The budget process itself is a weird game of chicken between various elected, political, and career officials. CIOs & CFOs know the tech debt gap, Department Secretaries/Administrators know the costs, but may not feel comfortable putting forth an accurate budget request. And when an Administration decides it wants to make reductions for, say, political reasons, IT generally is the first place to get cut. Then of course, Congress passes whatever budget it wants, ignoring what the President asks for. (Here’s a non-IT example that’s incredibly frustrating.) There are some really solid members that know their stuff, but most of them are clueless on tech. So when it comes to appropriations, it’s a bit like asking your grandma for that hot new Nintendo game and she buys you this: [image: Tiger Handheld Electronic Soccer Game from the 80s]

(And I’m not going to get into the toxic “outsource everything tech” lobotomization of government staff issue in this rant. That’s for another day.)

In Conclusion

This is all to say that it’s a big house of cards built on a series of broken processes. Although IT spending has increased steadily year after year, it’s still not enough to keep up with even basic upkeep of key systems. I expect we’ll see more and more high-profile failures and exploits in the near future as a result of these gaps. I do want to say that there are lots of good folks working to make incremental progress, but we won’t see any major revolutions in how services are provided by the government until Congress & the President agree to truly stop the bleeding.


Cloud Strategy Guide

2021.03.07

In part one, I discussed many of the myths around cloud use in government. In this article, I will describe critical strategies to address these myths that every organization should embrace before, during, and after moving to the cloud. These strategies are generally intended for civilian Federal agencies of the United States, but the recommendations below apply to any public sector organization - and even some private organizations as well. Both guides are available to download as a single PDF.
  1. Chapter 1 - Migrate Pragmatically
  2. Chapter 2 - Plan to Your Budget & Staff
  3. Chapter 3 - Embrace New Security Models
  4. Chapter 4 - Understand What You’re Buying
  5. Chapter 5 - Build a Family Farm
  6. Epilogue - Getting More Help

Chapter 1 - Migrate Pragmatically

The first thing to accept is that not all projects are appropriate for the cloud, and not all organizations have the skills necessary to fully take advantage of the cloud. With that as a starting point, an organization needs to come up with a way to rationalize its application portfolio, to determine what should stay on-premises and what should be modernized. As a general rule, “lift-and-shift” - moving an application without rewriting it for the cloud environment - is almost never cost-effective for Infrastructure as a Service (IaaS) offerings unless it’s already a very modern system in the first place. On the other hand, basic websites with mostly static content are ideal for moving into Software as a Service (SaaS) or Platform as a Service (PaaS) offerings. The CIO Council’s Application Rationalization Playbook (disclaimer: another document I worked on) is a useful starting point for this evaluation. Specifically, an agency should work up a thorough analysis of alternatives between various SaaS, PaaS, and IaaS offerings against the existing on-prem setup, or a hybrid environment. A major consideration here will be the Total Cost of Ownership (TCO), which should take into account not just service costs, but also staffing, support, and training costs. However, the lowest priced option may not always be the best choice (as I’ll be covering below). Cloud.gov, an offering from the General Services Administration (GSA) that bundles several Amazon Web Services (AWS) offerings in a government-friendly “procurement wrapper,” can make migration even easier for agencies. It’s an excellent platform for small agencies, or for large agencies that just want to prototype a new concept quickly. When you do start moving applications, it’s important to start tagging your assets - accounts, virtual machines, workflows, etc. - as early as possible to make accounting easier. Always include the project name and the customer organization at a minimum.
Some providers also allow you to easily isolate a project or office’s services into a resource group, and this can also simplify this process. This is very important to allow easy payback or showback of funds, but for these models remember to include in these costs the TCO aspects not captured - e.g. staff time and contractor resources. I strongly recommend agencies take a very cynical stance on so-called low-code/no-code platforms, customer-relationship management tools (CRMs), and workflow management solutions. Many of you may remember the promises of “Business Intelligence” solutions in decades past, where agencies were fleeced for billions of dollars in configuration costs - these solutions are simply using a new buzzword for the same idea. These all promise to reduce costs but are often vastly more expensive than just building a tool from scratch - and the agency becomes completely locked-in to a single vendor until they replace the application entirely. The brilliant Sean Boots of the Canadian Digital Service has presented a “1-day rule” to help identify these boondoggles.
  Checklist
Rationalize the application portfolio
Don’t lift-and-shift
Use cloud.gov
Properly tag cloud assets
Avoid low-code/no-code/crm snake oil
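To illustrate the tagging advice in this chapter, here is a small Python sketch that flags assets missing the two minimum tags (project name and customer organization). The tag keys, the shape of the asset records, and the function name are assumptions made for the example; each cloud provider exposes tags through its own API and naming rules.

```python
# Assumed tag keys; use whatever naming convention your agency settles on.
REQUIRED_TAGS = {"project", "customer_org"}


def untagged_assets(assets):
    """Return {asset id: sorted missing tag keys} for assets missing required tags."""
    problems = {}
    for asset in assets:
        tags = asset.get("tags", {})
        # A tag with an empty value is treated the same as a missing tag.
        missing = sorted(key for key in REQUIRED_TAGS if not tags.get(key))
        if missing:
            problems[asset["id"]] = missing
    return problems
```

Run on a schedule against an inventory export, a check like this gives an early list of assets that would otherwise be hard to attribute when showback time comes around.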

Chapter 2 - Plan to Your Budget & Staff

The easiest way to avoid risks and unexpected costs is to simplify as much as possible. Civilian agencies should not be investing in bleeding-edge technology solutions - they’re too risky and expensive to maintain. Instead, pick the simplest possible solution that can be supported by your staff. The average agency should be aiming to stay well behind the “hype curve” into the “plateau of productivity.” Since most of the complexity is hidden from the customer, SaaS and commercial-off-the-shelf (COTS) tools are less risky than PaaS and IaaS options overall (provided you follow the 1-day rule above). This goes beyond just cloud, and applies to most anything you’re building. Most agencies, for instance, also should absolutely not be attempting to build a fancy React/Redux/GraphQL single-page application when a plain WordPress or Drupal website with a few plugins will fulfill the customer’s needs. Building native mobile applications should be completely avoided by most organizations as these can cost millions of dollars a year just for upkeep - instead they should build mobile-friendly, responsive websites. Any custom application or tool may not be a sustainable solution given the high complexity and cost of engineers. This also means that agencies should be simplifying their requirements to the minimum necessary when comparing alternatives, not just the software itself. Avoiding “one-off” projects and special requests will save massive amounts of time and money. Instead, agencies must be actively investing in their staff. Agencies should allocate two to three times the standard training budget for IT and technology-adjacent staff, including project managers, program managers, and acquisition professionals. Some vendors provide a limited amount of complimentary training, but inevitably agencies need more than these free offerings.
This training should include non-IT topics as well, including diversity awareness training, accessibility, plain language writing, project management, agile development techniques, and budgeting and procurement. GSA offers a variety of programs covering many of these areas. This must also include hands-on training - sitting through a webinar is no replacement for actual practical engineering experience. These staff need to be given the time and flexibility to practice these skills to develop them - building small test projects and trying out tools. The best teams are constantly changing and learning, so setting aside up to 10% or more of the staff’s time just for practice is not unreasonable - some private sector companies set aside 20%. All of these investments will pay off richly for agencies. Also, make sure your staff is cross-trained and able to fill gaps as they occur. As your staff begins to understand the new cloud paradigms, it will be important to modify your existing processes to handle the agility the cloud brings. Instead of slow, end-to-end, waterfall process “monorails”, set operational parameters as “guardrails.” Your acquisition process should be modified so that cloud can be purchased like a utility. You should not need to have a Change Control Board meeting anytime someone wants to create, resize, or destroy a virtual server. Plan a cost range that the entire project will fit within and review as needs change, along with monthly or quarterly portfolio reviews to stay on top of the budget. Instead of codified “gold disk” server images maintained by your team, consider template security rules.
  Checklist
Simplify the requirements and architecture
No mobile apps, avoid single-page webapps
Train and cross-train your staff
Allocate time for personal development
Update processes to set guardrails instead of monorails
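One way to picture a cost guardrail is as a pre-agreed range that day-to-day changes are checked against, with only out-of-range projections going to a human review. The Python sketch below is an illustration under that assumption; the class name, field names, and dollar figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CostGuardrail:
    """An agreed monthly cost range for one project (values are illustrative)."""
    project: str
    monthly_min: float
    monthly_max: float

    def check(self, projected_monthly_cost):
        """Classify a projected monthly cost against the agreed range."""
        if projected_monthly_cost > self.monthly_max:
            return "needs review: over range"
        if projected_monthly_cost < self.monthly_min:
            return "needs review: under range"
        return "ok"


guardrail = CostGuardrail(project="website", monthly_min=1000.0, monthly_max=5000.0)
print(guardrail.check(3200.0))  # a resize that stays in range needs no meeting
```

Flagging costs under the range is a deliberate choice here: a project spending far less than planned is also worth a look at the next portfolio review.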

Chapter 3 - Embrace New Security Models

Agencies must be able to manage the security of everything they run. Going back to the previous strategy, an agency should not deploy anything it cannot manage, and that goes for security as well. This is equally true in on-premises environments, but new operating models require new security models. Both your operations and security teams will need to be familiar with just about every setting that can be changed in your cloud environment - and how to lock them down to prevent exploitation. Organizations should no longer assume that a solution is secure just because they did an up-front initial review. The Federal government uses a security review process for services and applications known as the Authorization (or Authority) To Operate (ATO), but the implementation varies from agency to agency. Traditionally this is a series of standard security controls that are reviewed, checklist-style, by an agency once every three years. However, agencies that have excelled at cloud security have moved to Continuous Authorization, using monitoring tools to actively verify that the security controls are being met and maintained, twenty-four hours a day and seven days a week. However, these monitoring checks still must evolve with the products being monitored to make sure new vulnerabilities have not appeared outside the scope of existing checks. As per usual with cybersecurity, vigilance is key. Since attackers are constantly evolving their methods, tools that automate security responses should also be used whenever practical - especially the built-in, native tools from the large vendors, which are constantly evolving to meet these threats. To help combat this issue, the Federal government has been moving away from so-called “castle-and-moat” perimeter-based security methods which only monitor network traffic.
Instead, an approach known as Zero Trust has appeared, taking a data-first methodology of protecting systems instead of just the perimeter, verifying user identities in real time, and allowing staff to only have access to the minimum amount of information necessary to fulfill the task at hand. In this way, when the perimeter is inevitably breached, the data assets contained within are still secure. It also should go without saying that teams should be using multi-factor authentication on all privileged accounts. Whether for developers or administrators, using more than just a username and password will dramatically reduce the risk of exploitation. The Federal government has “PIV cards” that are generally used on most devices, but if the vendor does not support them, implementing a token system via any of the commercially-available platforms is fine: Google Authenticator, 1Password, Microsoft Authenticator, and YubiKey are all worth looking at. However, organizations should completely avoid text-message codes sent to phones, as these are easily intercepted. For public customers that will need to log in or prove their identities, all U.S. government agencies should be using Login.gov.
  Checklist
  • Research all product configuration settings
  • Implement continuous monitoring, not just compliance
  • Use security automation tools
  • Leverage zero-trust practices to protect your data
  • Use MFA & Login.gov
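
To make the idea of continuous monitoring in this chapter a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the setting names (`privileged_mfa`, `mfa_methods`, `public_storage`) and the three controls are hypothetical, and a real baseline would come from an agency's own ATO controls and its vendor's actual configuration options.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical snapshot of settings pulled from a cloud environment.
Settings = dict[str, object]


@dataclass(frozen=True)
class Control:
    """One security control: a readable name plus a check against the settings."""
    name: str
    check: Callable[[Settings], bool]


# Illustrative controls only; these names are assumptions, not a real baseline.
CONTROLS = [
    Control("MFA required on privileged accounts",
            lambda s: s.get("privileged_mfa") is True),
    Control("SMS codes disabled",
            lambda s: "sms" not in s.get("mfa_methods", [])),
    Control("Storage not publicly readable",
            lambda s: s.get("public_storage") is False),
]


def failing_controls(settings: Settings) -> list[str]:
    """Return the names of controls the current settings do not satisfy."""
    return [c.name for c in CONTROLS if not c.check(settings)]
```

The point of the sketch is the shape rather than the specifics: the same checks that would be reviewed once in a checklist are instead run on a schedule against the live settings, and any non-empty result is raised to the security team. The list of controls also has to be revisited as the monitored products change.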

Chapter 4 - Understand What You’re Buying

Cloud isn’t going to make your teeth whiter or your breath fresher or fix all of your problems, regardless of what the salespeople tell you. You need to know exactly what you’re buying. Before making an investment, make sure you fully understand what capabilities you’re purchasing and what parts you - and the vendor - will be responsible for. If your evaluation team does not have technical expertise, bring engineers into the conversation early, to sort the truth from the sales pitch. As discussed in the previous article, you may not be getting autoscaling or load balancing or other features you’ve assumed just happen “automatically” - and if available these features definitely will not be free. You may have to build more “glue” between services than you assume, and someone will have to maintain this connective tissue. Also keep in mind that the government cloud regions (or “govcloud” by some vendors’ naming) provide different versions of these tools than the commercial ones. As a result, not all features or solutions will be available - so again, plan ahead. That said, in most cases civilian agencies not dealing with highly-sensitive data should consider using the commercial versions whenever possible - the security differences are not so great as to be insurmountable, but the functionality limitations are huge. Before implementing a service, do careful research on the service limits - maximum traffic or number of virtual machines or emails that can be sent, etc. Do not just trust what you are told by a vendor’s engineers or customer representatives - most of the time, they also do not know about these limits until you run aground on them. You should estimate your expected usage - number of site visits and/or users and/or emails, etc. - and actually spend the time to search through user forums to make sure no one has hit a limit related to what you’re doing.
Customer Experience (CX) is another area where the private sector has been building people-friendly interfaces into their SaaS solutions, and agencies can skip a lot of the hard work and directly benefit from the results. Metrics and feedback-loops are often built-in as well. Maximizing these built-in elements can radically improve an agency’s public satisfaction scores at little or no additional cost.
  Checklist
  • Validate assumptions; know your responsibilities
  • Consider commercial cloud instead of govcloud
  • Research service limits in advance
  • Leverage built-in CX tools
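
The advice in this chapter about estimating usage and researching service limits can be turned into a simple pre-launch check. The following is a rough sketch in Python, not a vendor tool: the metric names, the numbers, and the 50% headroom threshold are all made-up assumptions, and the real limits have to come from the vendor's documentation and user forums.

```python
def limits_at_risk(expected_peak: dict[str, float],
                   service_limits: dict[str, float],
                   headroom: float = 0.5) -> dict[str, float]:
    """Map each risky metric to the fraction of its limit the expected peak uses.

    A metric is risky if its expected peak exceeds `headroom` times the known
    limit, or if no limit is known for it at all (reported as infinity).
    """
    at_risk: dict[str, float] = {}
    for metric, peak in expected_peak.items():
        limit = service_limits.get(metric)
        if limit is None:
            # Unknown limit: flag it so someone researches it before launch.
            at_risk[metric] = float("inf")
        elif peak > limit * headroom:
            at_risk[metric] = peak / limit
    return at_risk


# Hypothetical estimates and limits, for illustration only.
expected = {"emails_per_day": 40000, "virtual_machines": 10, "requests_per_second": 200}
limits = {"emails_per_day": 50000, "virtual_machines": 100}
print(limits_at_risk(expected, limits))
```

Treating an unknown limit as a risk is deliberate here: it reflects the warning above that limits often surface only once you run aground on them.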

Chapter 5 - Build a Family Farm

Given that agency IT budgets continue to be cut, and staffing has not increased in 40 years, agencies are largely unprepared to completely rewrite and replace all of their legacy systems. Moreover, “IT Modernization” as a concept is an unending pursuit: as in Zeno’s paradox of Achilles chasing the Tortoise, software written today is legacy tomorrow. Agencies will need to use all available funding sources to overcome their deep technical debt, prioritizing the systems that present the greatest risk: those that are unmaintained, frequently used by customers, and lacking in resilience and redundancy. Under this scrutiny, agencies may find that their public websites are a bigger risk than older backend systems. Also, rather than replacing entire large monolithic systems, they should pull off pieces and replace them independently as resources are available. This can be done by isolating functions and building microservices, but that approach can often lead to expensive, unnecessary complexity. Agencies should not be afraid to build a newer parallel monolith adjacent to the existing one - again, keep in mind that it’s not the size that’s the concern, but the complexity and sustainability. That all being said, the government does have major shortcomings in redundancy today, and too many systems have a single point of failure. At a minimum, agencies should be using cloud for data backup of critical systems whenever possible. I also strongly recommend agencies consider creating load balancing and caching layers in the cloud in front of on-premises public-facing systems to deal with unexpected loads. One final concern is automation. Many organizations begin their cloud journey with unrealistic goals for maturity.
The practice of Infrastructure as Code is incredibly popular at the moment, where we talk about treating virtual servers as “cattle, not pets.” An unprepared agency may immediately think that they need to be using all of the most cutting-edge tools and technologies from the start, but this would be a critical mistake. Instead, following the principles relating to complexity in the sections above, agencies should aim to create a “family farm” - only automating that which they can realistically manage. For instance, there is absolutely nothing wrong with only using a few virtual machines and load balancers instead of a fully configuration-only architecture. The great thing about cloud is you can evolve as your team grows, but it’s incredibly difficult to reduce complexity you’ve invested in if your team shrinks.
  Checklist
  • Assess technical debt by risk
  • Replace monoliths a piece at a time
  • Don’t over-automate
  • Use cloud backups and load balancing as soon as possible
  • Build a small “family farm” to start

Epilogue - Getting More Help

These strategies are a starting point towards a successful cloud rollout. If you run into trouble, want to talk shop with your peers, or would like to share your own strategies and experiences, there are several communities to engage with:
  • The Federal CIO Council Cloud and Infrastructure Community of Practice is the main Federal group for discussing these topics. However, they are currently in the process of changing their charter to allow any U.S. government staff to participate: Federal, state, and local. Membership is free.
  • The ATARC Cloud and Infrastructure Working Group is free and open to any government staff, though private sector companies must pay to be members.
  • Cloud & Coffee (presented by ATARC & MorphWorks) is a biweekly podcast hosted by myself and Chris Oglesby. Each episode, we chat with a guest about their personal experience with technology modernization, and there’s a live Q&A open during the chat. Any ATARC member can participate; old episodes are publicly available on Spotify.

Cloudbusting

2021.02.28 – [Image: retro 80s video game (pixel art) style portrait of Kate Bush in front of a car, titled “Cloudbusting.” Caption: “Get in loser, we’re going cloudbusting!” Image by Bill Hunt.] The “Cloudbuster” was a device invented by Wilhelm Reich to create clouds and rain by shooting “energy” into the sky through a series of metal rods. Although Reich was paid by many desperate farmers to produce rain, the device was never proven to work. It’s been ten years since the Office of Management and Budget (OMB) released the original Federal Cloud Computing Strategy. I had the opportunity to update this strategy two years ago when I served as the Cloud Policy Lead at OMB. Having spent 20 years in the private sector building bleeding-edge cloud infrastructure for some of the best known companies in the world, I was able to leverage my practical experience in the creation of the 2019 Federal Cloud Computing Strategy, “Cloud Smart”. During the course of my work at OMB, I spoke with hundreds of practitioners, policy experts, and Chief Information Officers (CIOs) across government. From this vantage point, I had an intimate view into the entire Federal technology portfolio and learned that many myths about cloud computing were being accepted as truth. In this article, I’ll debunk key myths about cloud adoption, and explain why - and when - cloud is appropriate for government. These myths are generally intended for civilian Federal agencies of the United States, but the recommendations below apply to any public sector organization - and even some private organizations as well. In part two, I’ll discuss some strategies for overcoming the pitfalls discussed here. Both guides are available to download as a single PDF.


Myth 1: Cloud Is Cheaper

The main reason cited by Federal agencies to move to commercial cloud is the promise of cost savings. This myth originated with vendors and was repeated by Congress, eventually becoming a common talking point for Executive Branch agencies. Unfortunately, it is based on false premises and poor cost analyses. In practice, the government almost never saves actual money moving to the cloud - though the capabilities they gain from that investment will usually result in a greater value. At a glance, it can appear that moving applications to the cloud may be cheaper than leaving them in a data center. But in most cases, a Federal agency will not see much, if any, cost savings from moving to the cloud. More often than not, they end up spending many times more on cloud than for comparable workloads run in their data center. Experts have known this was a myth for at least a decade, but the lobbyists and salespeople were simply louder than those who had done the math. First, it’s important to note that most Federal agencies own outright the facilities their data centers are located in. In the 1980s and 1990s, agencies began repurposing existing office space for use as data centers, adding in advanced cooling and electrical systems to support their growing compute needs. This changes the equation for the total cost of ownership because the facilities are already built and can be run relatively cheaply, though they may be partially or fully staffed by contractors due to the constant push to outsource all work. The government has also built a few best-in-breed data centers such as the Social Security Administration’s flagship data center that can compete with some of the most efficient commercial facilities in the world, with solar collectors for electricity generation, and advanced heat management systems for reduced energy usage. 
However, these super-efficient facilities represent only a handful of the over 1500 data centers the government owns and operates, and cost half a billion dollars each to build. Second, agencies routinely run their servers and equipment well past the end-of-life to save money. There are no Federal requirements to update hardware. In fact, until recently, Federal data center requirements for efficiency measured the utilization of servers by time spent processing, which disincentivized agencies from upgrading - older hardware runs slower and thus results in a higher utilization rate for a given task than a newer, more efficient server that completes the task quickly. During a budget shortfall, an agency with a data center has the option of skipping a hardware refresh cycle or cutting staff to make up the deficit; meanwhile, an agency that is all-in on cloud loses this option, as they will have to continue paying for licenses, operations and maintenance costs. As a result, agencies will need to future-proof their plans in more innovative ways, or better communicate funding priorities to OMB and Congress. Also, it’s important to realize that once the government does buy hardware, the government owns it outright. When you move your application to a commercial cloud, you’re paying a premium for data storage even if it’s just sitting around and not being actively used - for large amounts of data, cloud costs will quickly skyrocket. The government maintains decades’ worth of massive data sets - NASA generates terabytes of data per day, and even a tiny agency like the Small Business Administration has to maintain billions of scanned loan documents going back to its inception sixty years ago. This is why some major companies have moved away from commercial cloud and built their own infrastructure instead. I would note that the idea of workload portability - moving a service between different cloud vendors, generally to get a cheaper cost - is also largely a myth.
The cost to move between services is simply too great, and the time spent in building this flexibility will not realize any savings. Moreover, every cloud vendor’s offering is just slightly different from its peers, and if you’re only using the most basic offerings which are identical - virtual servers and storage - you’re missing out on the full value that cloud offers.
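
The utilization incentive described earlier in this section is easier to see with numbers. This is a minimal sketch in Python, assuming utilization is simply time spent processing divided by the measurement window; the exact Federal metric is not spelled out here, and the job durations are made up for illustration.

```python
def utilization(busy_hours: float, window_hours: float = 24.0) -> float:
    """Fraction of the measurement window the server spent processing."""
    return busy_hours / window_hours


# Hypothetical workload: the same nightly job run on two different servers.
old_server = utilization(busy_hours=12.0)  # slower hardware needs 12 hours
new_server = utilization(busy_hours=3.0)   # faster hardware needs 3 hours

print(f"old server: {old_server:.1%}, new server: {new_server:.1%}")
```

Under a time-based measure like this one, the older server scores higher for doing exactly the same work, which is why such a metric rewards holding on to slow hardware.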

Myth 2: Cloud Requires Fewer Staff

Another promise of cloud cost savings is that an agency no longer has to keep data center engineers on staff. These practitioners are usually comparatively cheap to employ in government, and rarely reach a grade above GS-13 ($76K-$99K annual salary). Agencies moving to cloud will instead employ comparatively expensive DevSecOps practitioners, site reliability engineers, and cloud-software engineers to replace them when moving applications to IaaS or PaaS. These types of staff are extremely difficult to hire into government as they make very high salaries in the private sector, well in excess of the highest end of the General Schedule pay scale (GS-15: $104K-$138K), even assuming an agency has the budget and staff slots open to create a GS-15 position in the first place. Due to the many flaws in the government hiring process, it also can be very difficult to recruit these people into government, even with the new OPM hiring authorities to streamline this process. An agency that chooses to outsource these skills will often find that contractors may cost even more than hiring capable staff. The agency will still need to have staff with cloud experience to actively manage these contractors, and contracts will need to be carefully crafted around concrete outcomes so that the agency is not fleeced by a vendor. Another overlooked cost here is training. New solutions aren’t always easy for agencies to adopt - whether that’s a fancy software development tool or something as simple as a video chat platform. Personally, a day doesn’t go by that I don’t find myself explaining to a customer some aspect of Teams or SharePoint they don’t know how to use. Agencies often must provide formal training, and of course there’s inevitably a loss of productivity while teams get up to speed on the new tools and solutions. Since many SaaS vendors roll out new features extremely rapidly, this can present a challenge for slow-to-adapt agencies.
Although some training is provided free from vendors, this rarely suffices for all of an agency’s needs, so in most cases further training will have to be purchased.

Myth 3: Cloud Is More Secure

A constant refrain is that cloud is safer and more secure, owing to the fact that the servers are patched automatically - meaning that key security updates are installed immediately, rather than waiting for a human to make the time to roll out all of these updates. For a large enterprise, this is historically a very time-consuming manual process, which automation has improved dramatically. However, the same tools that major corporations use for patching in the cloud are largely open source and free, and they can be used in an agency’s own data center. Moreover, it’s important to note that cloud does not remove complexity, it only hides it in places that are harder to see. When it comes to security, this is especially true, as organizations must adapt to highly-specialized security settings that are not always easily found, particularly with the IaaS offerings. These settings are also constantly changing because of the constant patching by these vendors, and all too often with little notice in the case of SaaS offerings. This “double-edged sword” has resulted in a number of high-profile cloud-related breaches over the last few years - affecting the public and private sectors alike as we learn best security practices the hard way. Cloud vendors have also been… less than enthusiastic about meeting government security and policy requirements, unless the government is willing to pay a very high premium for the privilege of security. (I talked about this contentious relationship more in my post on Automation Principles.) For instance, as of today no major cloud vendor completely meets the government requirements for IPv6, which have been around for 15 years and which OMB recently revised to try to get them to move faster.

Myth 4: Cloud Is More Reliable

This one is less of a myth and more of an overpromise, or fundamental misunderstanding of the underlying technology. For a long time, one of the main pitches of cloud has been self-healing infrastructure - when one server or drive fails, a new one is spun up to replace it. Although this is something that can be implemented in the cloud, it’s definitely not the default. Specifically, for IaaS solutions, you have to build that into your application - and you don’t get it for free. Relatedly, many agencies assume that any application put into the cloud will automatically scale to meet any demand. If your agency’s website gets mentioned by the President, let’s say, you wouldn’t want it to collapse due to its newfound popularity. Without building infrastructure designed to handle this, simply being “in the cloud” will not solve this problem. However, solving it in the cloud will likely be faster than waiting for physical servers to be purchased, built, shipped, and installed - assuming you have staff on-hand who can handle the tasks. It is important to keep in mind that cloud is, by definition, ephemeral. Servers and drives are often replaced with little-to-no notice. I’ve frequently had virtual machines simply become completely unresponsive, requiring them to be rebooted or rebuilt entirely. When you’re building in the cloud, you should assume that anything could break without warning, and you should have recovery procedures in place to handle the situation. Tools like Chaos Monkey can help you test your recovery procedures. One issue that even some of the most seasoned practitioners miss is that all cloud providers have hard limits on the resources that they are able to sell you. After all, they are just running their own data centers, and there are a fixed number of servers that they have on-hand. I have often encountered these limits in practical, seemingly-simple use cases.
For instance, I’ve created applications which needed high-memory virtual servers, where the provider didn’t have enough instances to sell us. During the pandemic response, I also discovered that cloud-based email inboxes have hardcoded, technical limits as to the volume of mail they can receive. I had assumed we could simply buy more capacity but this was not the case, requiring a “Rube Goldberg machine” workaround of routing rules to handle the massive increase associated with a national disaster. There is no question that scalability is a huge benefit, until the practical limits become a liability because of your assumptions.
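
As a small illustration of the earlier point that self-healing has to be built rather than assumed, here is a hedged sketch in Python of a retry wrapper with a recovery step. The `operation` and `rebuild` callables are hypothetical stand-ins (for example, a request to a virtual machine and a script that replaces it); no real cloud API is used or implied.

```python
import time
from typing import Callable


def call_with_recovery(operation: Callable[[], str],
                       rebuild: Callable[[], None],
                       attempts: int = 3,
                       base_delay: float = 1.0) -> str:
    """Try an operation; on failure, run a recovery step and retry with backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure to a human
            rebuild()  # hypothetical recovery step, e.g. replace the instance
            time.sleep(base_delay * (2 ** attempt))  # wait longer each time
    raise AssertionError("unreachable: the loop either returns or raises")
```

A recovery path like this is only trustworthy if it is exercised regularly, which is the role that failure-injection tools such as Chaos Monkey play.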

Myth 5: Cloud Must Be All-or-Nothing

Many organizations assume that the goal is to move everything to a commercial cloud provider. Both the Government Accountability Office and Congress have stated that the government needs to “get out of the data center business.” However, this is simply not a realistic goal in the public sector - government couldn’t afford to make such a massive move given its very restricted budgets. We also must clarify the concept of “legacy systems,” another frequent talking point. Most Federal agencies that have been around for more than 30 years still have mainframes, and they’re often still running older programming languages such as COBOL, Fortran, and Pascal. Many major industries in the private sector still use these same technologies - most notably, the banking industry still is heavily dependent on these legacy systems. Regardless of the hype about cloud and blockchain for moving money around, 95% of credit card transactions still use a COBOL system, probably running on a mainframe behind the scenes. These systems are not going away any time soon. Now, these mainframes usually are not dusty old metal boxes that have been taking up an entire basement room for decades. Often, they’re cutting-edge hardware that’s incredibly efficient - and even have all the shiny plastic and glowing lights and advanced cooling systems you’d expect to see on a gamer’s desktop computer. Dollar for dollar, modern mainframe systems can be more cost-effective than cloud for comparable workloads over their lifecycle. It’s also worth noting that they are about a thousand times less likely to be attacked or exploited than cloud-based infrastructure. The code running on these mainframes, on the other hand, is likely to be very old, and it’s almost certainly been written such that it cannot be virtualized or moved to the cloud without being partially or entirely rewritten at great expense.
Modern programming languages come with their own risks, so finding a sustainable middle path between the ancient and bleeding-edge is important for a successful modernization effort. Due to the considerations above, the future of government infrastructure will remain a hybrid, multi-cloud environment - much to the consternation of cloud vendors.

“… I just know that something good is gonna happen”

Instead of these myths, the best reason to use cloud is for the unrivaled capabilities that these tools can unlock:
  • Agility: being able to quickly spin up a server to try something new is much easier in the cloud, if you have not already created an on-premises virtualized infrastructure. Cloud.gov, an offering from the General Services Administration (GSA) that bundles many Amazon Web Services (AWS) offerings in a government-friendly “procurement wrapper,” can make this even easier for agencies.
  • Scalability: the main hallmark of cloud is using this agility to quickly respond to sudden increases in requests to websites and applications. Especially during the COVID-19 pandemic, agencies have taken advantage of this functionality to deal with the dramatic increase in traffic to benefit applications and other services. However, it is critical to note that most cloud services do not scale automatically (another myth, covered above under Myth 4).
  • Distributed: most Federal agencies have staff in field offices all over the country, and of course their customers are both at home and abroad. Since the cloud is really just a series of distributed data centers around the world, this can dramatically reduce the latency between the customer and the service. For instance, agencies are using cloud-based virtual private network (VPN) solutions to securely connect their staff to internal networks. Those that have moved to cloud-based email, video chat, and document collaboration tools see an additional speed boost from staying in the same cloud for all of these services.
Of course, we all know that “cloud is just someone else’s data center,” but the government should not be held back by fear, uncertainty, and doubt from someone else holding their data. Cloud technologies have a huge potential to improve Federal technology, when approached with a full knowledge of the complexity and costs. Cloud is not a replacement for good management, however. You can’t buy your way out of risk. Until the government invests in its workforce to make sure that IT can be planned, acquired, implemented, and maintained effectively, we will not see any improvement in the services provided to the American people. Now, Congress just needs to be convinced to fully fund some of these improvements. Next week I’ll share part two, where I will discuss several key strategies for a successful cloud implementation in a government agency.

Login.gov for Everyone!

2021.02.18 – A little over two years ago, I was walking out of the New Executive Office Building by the White House. I immediately ran into Robin Carnahan, who said to me, “Bill, we should be able to provide Login to cities and states.” (If you haven’t met Robin, let me just make it clear for the narrative here that she’s super-smart and anything she says you should just agree with immediately because she knows what she’s talking about.) As soon as I got back to my desk at the Office of Management and Budget (OMB), I started sending out emails to figure out why the General Services Administration (GSA) was preventing this excellent service from being used by smaller governments. For those of you who don’t know about this hidden gem, Login.gov is a GSA solution to help solve the difficult problem of verifying that a person is who they say they are to receive a government benefit, as well as a solution for logging into government websites. It was created through the combined efforts of USDS and 18F - the two most prominent digital service teams in all of government - and is in use by many Federal agencies today. Today it provides access to government services for over 27 million people! Today, GSA has announced that Login.gov is available for use by local and state governments! (To be clear, I had effectively nothing to do with the actual permission being granted here - sending stern emails had little effect. The victory today belongs entirely to the wonderful, amazing, fantastic team at Login and the bureaucrats who were willing to push to make it happen.) There are, however, still a few restrictions for city and state use. To be eligible, the government agencies must be using Login for a “federally funded program.” This is an arbitrary addition by GSA that, in my opinion, misinterprets the original intent of the legal authority - but I’m not a lawyer and am no longer responsible for these sorts of policy decisions. 
I am hopeful that this restriction will be removed in the future and this incredible service will be open to all who want it! Moreover, as I’ve written in the past, it is my hope that OMB will mandate the use of Login for all Federal agencies. This is already mandated by law, but OMB is not enforcing the requirement. The most expensive part of the tool is the identity verification step - however, once an identity has been proven, it does not need to be re-proven if the customer wants to use any other service that is using Login. This means that as more organizations sign up for Login, the cost to each decreases. By allowing Federal agencies to maintain their own independent login systems, the costs remain high. Moreover, this presents customers with an inferior experience, as they must sign up for a new account for each website or application. It’s also important to note that most identity verification behind the scenes is using data sources that the government controls and gives to private companies, who then sell the government back its own data in the verification process at a very high premium. Eventually, it would be smarter to allow agencies to exchange the necessary information themselves, cutting out the middleperson, which would decrease the cost to almost nothing. (Congress, of course, could speed this along too with the right legislation.) I’ve heard that the Login team has also been working on a pilot to allow customers to prove their identity in-person at a government facility, which has been shown to improve the success rates of the verification process. The Department of Veterans Affairs (VA) uses such a process to help Veterans walk through the process of setting up their online accounts right in the lobby of many VA health clinics.
The US Postal Service also performed a similar pilot several years ago, where anyone could stop by a post office and have them review their documents, or even let their postal carrier perform the review when they drop off the day’s mail, allowing them to reach almost every single person in the country! Detractors still complain about the cost of Login.gov, and consider that a reason to not require it, even though the cost would be reduced if it were mandated. Even so, if the Federal government agrees that this is the tool that agencies should be using, then it should be treated like a Public Good - like a library or park. To that end, Congress could pass appropriations dedicated to funding this critical program, for instance as part of President Biden’s proposal for TTS Funding. However, I would caution agencies against implementing identity requirements beyond what is absolutely necessary! The Digital Identity Guidelines from the National Institute of Standards and Technology (NIST) are the baseline that most Federal agencies use; in my personal opinion, they set too high a bar. The government must provide critical services to at-risk and economically disadvantaged groups, and by setting requirements that individuals in these groups cannot meet, agencies are not serving people equitably. For instance, the VA serves Veterans that may be homeless, may not have a credit card, may be partially or fully blind, may have trouble remembering or recalling information, may not have fingerprints, and so on. Since the standard methods of identity verification and authentication may present an impossible barrier for the very people the VA serves, it is in the best interest of these people to not implement NIST’s high standards as written. (And I told NIST the same thing.) If you’re a city or state government interested in a world-class identity solution, I’d recommend reaching out to GSA about Login.gov!
Even if you don’t meet the requirement mentioned above, it’s definitely worth getting in touch with GSA anyway - as we’ve learned, policies change every day.
