GitHub's engineering team has moved to Codespaces

The GitHub.com codebase is nearly 14 years old. When the first commit for GitHub.com was pushed, Rails was only two years old. AWS was one. Azure and GCP did not yet exist. That may not be long in COBOL time, but in internet time it is quite a lot.

Over those 14 years, the core repository powering GitHub.com (github/github) has received more than a million commits. The vast majority of those commits came from developers building and testing on macOS.

But our development platform is constantly evolving. Over the past few months, we have left our macOS model behind and moved to Codespaces for the majority of GitHub.com development. This has been a fundamental shift in our day-to-day development flow. As a result, the Codespaces product is stronger, and we are well positioned for the future of GitHub.com development.

The status quo

Over the years, we invested significant time and effort in making local development work out of the box. For some time, our "scripts to rule them all" approach gave engineers a familiar interface: new hires could clone github/github, run the setup and bootstrap scripts, and have a local instance of GitHub.com running within half a day. In most cases, everything just worked. When it didn't, our bootstrap script would open a GitHub issue connecting the new hire with internal support. Our #friction Slack channel is staffed by kind, helpful engineers who can debug nearly any system configuration.

Yet for all our efforts, local development remained brittle. Any seemingly innocuous change could render a local environment useless and, worse, consume hours of valuable development time to recover. Mysterious breakage was so common and so catastrophic that we wrote an option for our bootstrap script: --nuke-from-orbit. When invoked, the script deletes as much as it can in an attempt to restore the local environment to a known good state.

Of course, this is a classic story that anyone in software engineering will recognize immediately. Local development environments are fragile. And even when working perfectly, a single-context, bespoke local development environment felt increasingly at odds with the instant-on, access-from-anywhere world in which we now operate.

Collaborating on multiple branches across multiple projects was painful. We often found ourselves staring down a 45-minute bootstrap when a branch introduced new dependencies, shipped schema changes, or branched from a different SHA. Given how quickly our codebase changes (we deploy hundreds of changes every day), this was a common source of engineering friction.

We weren't the only ones to notice. While building Codespaces, we worked with several best-in-class engineering organizations that had built Codespaces-like platforms to solve these same kinds of problems. At any significant scale, removing this type of productivity loss very quickly becomes a clear opportunity.

Development Infrastructure

In the world of infrastructure, industry best practice continues to treat servers as a commodity. The idea is that no single server is unique, indispensable, or irreplaceable. Any piece can be taken out and replaced by a like piece without much fanfare. If a server goes down, that's okay! Tear it down and replace it with another.

Our local development environments, however, are each unique, with their own special quirks. As a consequence, they require near-constant vigilance to maintain. The next git pull or bootstrap can quickly degrade your environment, forcing an expensive context switch to recovery work when you would rather be building software. There is no convention of a warm laptop standing by.

But there is a lot to be said for development environments being our own: they are where we spend most of our time! We tweak and tune our workbench in the service of productivity, but also as an expression of ourselves.

With Codespaces, we saw an opportunity to treat our development environments much like infrastructure (a commodity we can churn) while still maintaining the ability to curate our workbench. Visual Studio Code extensions, settings sync, and dotfiles repositories bring our environment to our compute. In this context, a broken workbench is a minor inconvenience: we can now provision a new codespace in a known good state and get back to work.

Adopting Codespaces

Migrating to Codespaces addressed the shortcomings of our existing developer environments, motivated us to push the product further, and gave us leverage to improve our overall development experience.

Although our migration story has a happy ending, the first stage of our transition was, well, challenging. The GitHub.com repository is almost 13 GB on disk; simply cloning it takes 20 minutes. Combined with dependency setup, bootstrapping a GitHub.com codespace took upward of 45 minutes. And once we finally had the repository in a codespace, the application wouldn't run.

Fourteen years of assumptions that macOS sits at the center of our bootstrap process would have to be undone.

Overcoming these challenges brought out the best of GitHub. Contributors from across the company helped us revisit past decisions, question long-standing assumptions, and work at the source level to decouple GitHub development from macOS. Eventually, we could (albeit very slowly) provision a working GitHub.com codespace on a Linux host, connect from Visual Studio Code, and ship some work. Now we had to figure out how to make the thing hum.

45 minutes to 5 minutes

Our goal with Codespaces is to adopt a model in which we provision an on-demand development environment for the task at hand (a roughly 1:1 mapping between branches and codespaces). To support task-based workflows, we need to get as close to instant-on as possible. Forty-five minutes was never going to meet that bar, but we could see low-hanging fruit, ripe with potential optimizations.

First up: changing how Codespaces clones github/github. Rather than performing a full clone at provisioning time, Codespaces now performs a shallow clone and then, once the codespace has been created with the most recent commits, unshallows the repository history in the background. Doing so reduced clone time from 20 minutes to 90 seconds.
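
The shallow-then-unshallow idea can be sketched with plain git commands driven from Python. This is a minimal illustration, not how Codespaces itself is implemented: the function names are hypothetical, and it assumes git is installed and the repository URL is reachable.

```python
import subprocess
from typing import List


def shallow_clone_cmd(repo_url: str, dest: str) -> List[str]:
    """Command that fetches only the most recent commit (depth 1)."""
    return ["git", "clone", "--depth", "1", repo_url, dest]


def unshallow_cmd(dest: str) -> List[str]:
    """Command that backfills the full history of an existing shallow clone."""
    return ["git", "-C", dest, "fetch", "--unshallow"]


def create_workspace(repo_url: str, dest: str) -> subprocess.Popen:
    """Clone shallowly, then start the history backfill without waiting for it."""
    subprocess.run(shallow_clone_cmd(repo_url, dest), check=True)
    # Popen returns immediately, so the caller can start working while
    # the remaining history downloads in the background.
    return subprocess.Popen(unshallow_cmd(dest))
```

The trade-off is that the full clone cost is deferred rather than removed: the background fetch still consumes network and disk while the developer is already working.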

Our next opportunity: caching the network of software and services that support GitHub.com, including traditional Gemfile-based dependencies as well as services written in C, Go, and a custom-built Ruby. The solution was a GitHub Action that runs nightly, clones the repository, bootstraps dependencies, and builds and pushes the resulting Docker image. The published image is then used as the base image in github/github's devcontainer, which is configuration-as-code for Codespaces environments. Our codespaces would now be created 95%+ bootstrapped.
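
To make the base-image idea concrete, here is a small sketch that renders a devcontainer.json pointing at a prebuilt image. The image name and project name are placeholders invented for illustration; the article does not give the real registry path or the contents of github/github's devcontainer.

```python
import json

# Hypothetical image name; the real registry path is not given in the article.
NIGHTLY_IMAGE = "ghcr.io/example-org/example-app-dev:nightly"


def devcontainer_config(image: str) -> dict:
    """Minimal devcontainer.json content that starts from a prebuilt image."""
    return {
        "name": "example-app",
        "image": image,
    }


def render(config: dict) -> str:
    """Serialize the config the way it would appear in devcontainer.json."""
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(render(devcontainer_config(NIGHTLY_IMAGE)))
```

Because the nightly job has already installed the dependencies into the image, creating an environment from it only needs to apply whatever changed since the last build.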

These two changes, along with a handful of application- and service-level optimizations, brought GitHub.com codespace creation time down from 45 minutes to 5 minutes. But five minutes is still a long way from "instant-on." Well-known research suggests that people can sustain roughly 10 seconds of waiting before they fall out of flow. So although we had made great progress, we still had a long way to go.

5 minutes to 10 seconds

Although five minutes represented a significant improvement, these changes involved trade-offs and hinted at a more general product need.

Our shallow clone approach, useful for getting into a codespace quickly, still required that we pay the cost of a full clone at some point. The load generated by unshallowing after creation had distracting side effects. Any large, complex project would face a similar class of problems, in which cloning and bootstrapping compete for available resources.

What if we could clone and bootstrap the repository ahead of time, so that most of the work is already done when an engineer requests a codespace?

Enter prebuilds: pools of codespaces, fully cloned and bootstrapped, waiting to be connected with a developer who wants to get to work. Our engineering investment in prebuilds has paid for itself many times over: we can now create a reliable, preconfigured codespace, ready for GitHub.com development, within 10 seconds.
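
The pooling pattern behind prebuilds can be illustrated with a toy warm pool. This is a conceptual sketch only: the class and method names are hypothetical, provisioning is simulated in memory, and nothing here reflects the actual Codespaces service.

```python
from collections import deque
from dataclasses import dataclass
from itertools import count
from typing import Deque, Iterator


@dataclass
class Environment:
    env_id: int
    ready: bool = True  # stands in for "cloned and bootstrapped"


class WarmPool:
    """Keeps `target_size` environments provisioned ahead of demand."""

    def __init__(self, target_size: int) -> None:
        self._ids: Iterator[int] = count(1)
        self._target_size = target_size
        self._ready: Deque[Environment] = deque()
        self.refill()

    def _provision(self) -> Environment:
        # In a real system this is the slow step (clone, install, build).
        return Environment(env_id=next(self._ids))

    def refill(self) -> None:
        """Provision until the pool is back at its target size."""
        while len(self._ready) < self._target_size:
            self._ready.append(self._provision())

    def claim(self) -> Environment:
        """Hand out a ready environment, then top the pool back up."""
        env = self._ready.popleft() if self._ready else self._provision()
        self.refill()
        return env

    def available(self) -> int:
        return len(self._ready)
```

The slow work happens in refill, away from the developer's request path, so claim only pays the cost of handing over an environment that already exists. A production pool would refill asynchronously rather than inline.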

New hires can go from zero to a functioning development environment in less time than it takes to install Slack. Engineers can spin off new codespaces for parallel workstreams with no overhead. When an environment falls apart (maybe it has fallen too far behind, or the test data broke something), our engineers can quickly create a new environment and get on with their day.

Increased leverage

Switching to Codespaces solved some very real problems for us: it eliminated the fragility and single-track model of local development environments. It also gave us a powerful new point of leverage for improving the developer experience at GitHub.

We now have a wedge for performing additional setup and optimization work that we would never have considered in local environments, where the cost of these optimizations (in both time and patience) is too high. For example, with prebuilds we now prime our language server cache and gem documentation, run pending database migrations, and enable both GitHub.com and GitHub Enterprise development modes, a task that would typically require yet another loop through bootstrap and setup.

With Codespaces, we can upgrade every engineer's machine specs with a single configuration change. In the early stages of our Codespaces migration, we used VMs with 8 cores and 16 GB of RAM. Those machines were sufficient, but GitHub.com runs a network of different services and will happily consume every core and every byte of RAM we are willing to provide. So we moved to VMs with 32 cores and 64 GB of RAM. By changing a single line of configuration, we upgraded every engineer's machine.
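
As an illustration of a one-line spec change, the sketch below bumps the `hostRequirements` entry of a devcontainer-style configuration from 8 cores and 16 GB to 32 cores and 64 GB. The article does not say which setting GitHub actually changed, so treating `hostRequirements` as the mechanism is an assumption, and the helper name and project name are hypothetical.

```python
import copy


def with_machine_spec(config: dict, cpus: int, memory_gb: int) -> dict:
    """Return a copy of a devcontainer-style config requesting a larger machine."""
    updated = copy.deepcopy(config)
    updated["hostRequirements"] = {"cpus": cpus, "memory": f"{memory_gb}gb"}
    return updated


# Starting point described in the article: 8 cores, 16 GB of RAM.
base = {"name": "example-app", "hostRequirements": {"cpus": 8, "memory": "16gb"}}

# The upgrade described in the article: 32 cores, 64 GB of RAM.
upgraded = with_machine_spec(base, cpus=32, memory_gb=64)
```

Because the requirement lives in the repository's configuration rather than on individual laptops, every environment created after the change picks up the new spec.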

Codespaces has also started to steal business from our internal "review lab" platform, a production-like environment where we can preview changes with internal collaborators. Before Codespaces, GitHub engineers needed to commit and deploy to a review lab instance (which often required peer review) in order to share their work with colleagues. Friction. Now we ctrl+click to grab a preview URL and send it to a colleague. No commit, no push, no review, no deploy: just a live look at port 80 on my codespace.

Command line

Visual Studio Code is great. It is the primary tool GitHub.com engineers use to interact with codespaces. But asking our Vim and Emacs users to commit to a graphical editor is less great. If Codespaces is our future, we have to bring everyone along.

Happily, we could support our shell-based colleagues simply by updating our prebuilt image to initialize the SSH daemon with our GitHub public keys, open port 22, and forward the port out of the codespace.
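
Once port 22 is forwarded to a local port, connecting is an ordinary SSH session. The helper below only assembles the command line; the local port, user name, and function name are placeholders chosen for illustration and do not come from the article.

```python
from typing import List, Optional


def ssh_command(local_port: int, user: str, editor: Optional[str] = None) -> List[str]:
    """Build an ssh invocation against a port forwarded to localhost."""
    cmd = ["ssh", "-p", str(local_port)]
    if editor is not None:
        # -t forces a terminal so a full-screen editor can run remotely.
        cmd.append("-t")
    cmd.append(f"{user}@localhost")
    if editor is not None:
        cmd.append(editor)
    return cmd
```

For example, `ssh_command(2222, "dev", editor="vim")` yields the arguments for opening Vim directly on the remote side.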

From there, GitHub engineers can run Vim, Emacs, or ed, if they want.

This works very well! And, just as Docker image caching led to prebuilds, the obvious next step is to take what we have done for the GitHub.com codespace and make it a first-class experience in every codespace.

Article category：Tech Headlines
Article tags：Tech Headlines, GitHub
Post date：2021-08-22 01:01:13
Article url：http://elephdev.com/tech-headlines/260.html
