
I recently wanted to set up Did I Hike That? to allow for a demo user that could play with the app but not affect any existing data. This required that I know who created each hike record, which I didn’t include in the initial design. I realize now that was a bit shortsighted. But no worries, that’s what database migrations are for!

I use Sequelize as the ORM for this app and have already done one migration on the database, so this would just be a new one. I originally used the Umzug library to do things programmatically since I didn’t want to mess with calling the CLI. The new migration script needed to add a new column to the hike table and then update that value for every hike. Sounded easy, but there was a bit of nuance.

I wanted everything to run inside a transaction so if something went sideways, the database would be left untouched. But I also needed the new column to be present inside the transaction so I could update it later in the script, which was a bit of a catch-22. This project is using Sequelize 6, which doesn’t allow nested transactions (it seems v7 will support them). In v6 there are a couple ways to execute a transaction:

  • Managed, where you call the built-in transaction method and pass it a callback. If an exception is thrown inside that callback, the changes are rolled back; otherwise they are committed.
  • Unmanaged, where you call the transaction method without a callback and use the transaction object it returns to manually commit or roll back changes by calling the appropriate method.

The previous migration script used the unmanaged approach because I wanted maximum flexibility over when things happened. But when I ran the new migration script there were a couple of problems. The entire up function is in a try/catch, yet the column addition would persist even if an exception was thrown later (the raw query of the hike table had some minor issues as I was writing it that would produce an error). Once I fixed that, the script got to the point of updating the new column, but kept throwing a 'column not found' error.

I don’t remember if I knew this before, but after some web searching and re-reading the docs I realized you can pass that transaction object to any call of the query function on the main sequelize object. The docs say including it will cause Sequelize to create a save point for whatever query you are executing. Nothing is committed until you call commit, but the change is visible to later queries that include the same transaction object. Which is exactly what I needed. The end result looked something like this:

const transaction = await queryInterface.sequelize.transaction();

try {
    const hikeTableDefinition = await queryInterface.describeTable('hikes');

    if (!hikeTableDefinition.userId) {
        // Including the transaction here will make Sequelize aware of the new column so we can update it later
        await queryInterface.addColumn('hikes', 'userId', { type: DataTypes.STRING }, { transaction });
    }

    // ...

    await queryInterface.sequelize.query("Update hikes Set userId = 'some_user_id' Where id = 'some_hike_id'", { transaction });

    await transaction.commit();
} catch (error) {
    await transaction.rollback();
    throw error;
}

The two queries that modify the database include the top-level transaction so they are both aware of any changes being made. If the hike table update blows up for some reason it will roll back everything, which was perfect since I was looping through all the hike records and wanted it to be all or nothing. I added the check for the existence of the new column just to make doubly-sure it doesn’t throw unnecessarily, since trying to add a column that’s already there will definitely throw.

This is part of a series of posts on learning Python and building a Python web service.

Choice of Framework – Part 2

My first goal was to set up two endpoints, one for searching for an artist/album/song and one for lookup of a particular entity, to the point where I could send data in and receive some back. The next goal was to be able to deploy the service to the cloud (Azure) and have everything still work as expected. Once that was done I would begin the process of fetching and returning music metadata. For now it was time to dig into how APIFlask does things.

Python has good support for decorators, and APIFlask takes advantage of them to define the input, output, and HTTP operation for an endpoint. In this way it’s similar to libraries I’ve used with Express that augment how you define endpoints. The decorator for input lets you define the names of the API parameters, and APIFlask will populate matching arguments to the function that responds to the given request. You can also specify validation rules and defaults that let you avoid writing manual checking code. I’m using those to set defaults for the pagination parameters on the search and to make sure search text is included as a query string parameter. Note that the input decorator is optional, and you can use more than one on a function. All of this is managed by defining a schema for the input. Here is the definition for the search endpoint and its schemas:

@app.get('/search/<string:entity_type>')
@app.input(SearchParameters, location='query')
@app.output(SearchOutput)
def search(entity_type, query_data):
    ...

class SearchParameters(Schema):
    page = Integer(load_default=1)
    pageSize = Integer(load_default=10, validate=OneOf([10, 25]))
    query = String(required=True, metadata={'description': 'The search text'})

class SearchOutput(Schema):
    rows = List(Nested(SearchResult()))
    count = Integer()

The tricky part there was that the endpoint had both a path parameter and query string parameters, and at first I couldn’t figure out exactly how to configure all of that. The path parameter is easy: its name goes in the @get endpoint path and it gets matched to the first argument of the search function. The @input decorator specifies the query string parameters via the schema class for them. It puts all the values into a single object rather than into separate variables. I called that argument query_data, but I don’t think it matters what you name it, since it will always be the second argument to the function.
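
To make that concrete, here is a hypothetical request (the search text is just a placeholder) and how I understand the pieces get mapped:

GET /search/artist?query=pixies&page=1&pageSize=10

# entity_type -> 'artist' (from the path)
# query_data  -> {'query': 'pixies', 'page': 1, 'pageSize': 10} (from the query string)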

I had tried adding two @inputs but couldn’t quite get it to work. You are supposed to be able to specify the location as path, but I couldn’t see how you would tie that to the part of the path that represents the value. The schemas are fairly straightforward class definitions, and APIFlask passes them to Marshmallow under the hood. They are mostly mapped to the shape of the data that MusicBrainz returns, though they are generic enough that they should work with other providers. One twist is that any class field that is itself an object needs to be wrapped in the Nested class. In the end I was able to define everything in a way that worked and kept the framework happy.

Here are the output schemas for the lookup endpoint:

class Album(Schema):
    id = String()
    name = String()
    artist = String()
    release_date = String()
    description = String()
    tags = List(String())
    image = Nested(Image())
    links = List(String())

class Artist(Schema):
    id = String()
    name = String()
    description = String()
    life_span = Dict()
    area = Dict()
    begin_area = Dict()
    tags = List(String())
    images = List(Nested(Image()))
    albums = List(Nested(Album()))
    members = List(Nested(BandMember()))
    links = List(String())

To test the request/response cycle I simply included the inputs in the JSON response. I tested the endpoints using the same tool I’ve been using for a while now, and that’s SoapUI (a.k.a. ReadyAPI). I have the open source edition which is not the easiest thing in the world to use, but it does the job (probably Postman would be better, but that would have been yet another thing to learn).
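
A rough sketch of what that kind of smoke test can look like, with the caveat that the echo body here is my own stand-in rather than the actual code:

@app.get('/search/<string:entity_type>')
@app.input(SearchParameters, location='query')
def search(entity_type, query_data):
    # Temporarily echo the parsed inputs back to verify the request/response cycle
    return {'entity_type': entity_type, **query_data}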

Final review for APIFlask: thumbs up.

Despite the documentation not having an example for my exact situation, it was decent enough to get things going. I like being able to configure things so the framework does some of the work for you. It also includes niceties like built-in error handling and auto-generated Swagger UI docs.

This is part of a series of posts on learning Python and building a Python web service.

Scaffolding

Not knowing the best way to structure a Python web service project, I went searching on the web for good scaffolding examples for a flask app. I found a few and decided to go with this one. It seemed to lay a good foundation and could be customized as needed. Little did I know…

I ran into a couple of issues. The first was that a dependency of one of this repo’s dependencies didn’t work with Python 3.13, at least in the version that was being installed. The maintainers had put out an update to fix it, but the downstream package hadn’t started using it yet. Python 3.13 had just been released a few weeks earlier, and naturally I had installed it since it was the latest. That’s annoying but understandable.

Sidebar on how to install Python dependencies:
As with many things in the Python world, there are multiple ways you can manage project dependencies. The easiest way is to use pip. Like npm, it’s installed once globally (it ships with modern Python) and you can use it in every project to install packages. To install them all at once, you list the packages you need and the versions you want in a separate requirements.txt file, and then tell pip to install everything specified in that file. Another way is to use a tool called Poetry. It looks for a file named pyproject.toml that has a list of packages/versions plus general info on how to manage the project’s dependencies. It’s also more modern and full-featured than pip.
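
For reference, the pip flavor of that workflow looks something like this (the package names and versions are just examples, not the project’s actual dependencies):

# requirements.txt -- one package per line, pinned to the versions you want
flask==3.0.3
requests==2.32.3

# then install everything in one shot
pip install -r requirements.txt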

The scaffolding repo used Poetry and the fix for my first issue was to include the updated library that was the source of the problem as a special requirement before all other packages were installed. It took some research on the specs for the pyproject.toml file and trial and error to figure out the right entry to make, but that first error message eventually went away and all the packages got successfully installed. For the curious, the entry looked like this:

[tool.poetry.dependencies]
python = "^3.12"
# Needed for Python 3.13
greenlet = "^3.1.1"

The next issue I hit was more damaging. The scaffolding expected to use a particular HTTP server named gunicorn. There’s just one problem: it doesn’t run on Windows (kind of lame, since IIS, Apache, and the various Node-based HTTP listeners I’ve used over the years all run fine on that OS). So trying to run my project based on this scaffolding blew up spectacularly.

It was at that point I decided to go back to square one and try to simplify things. I ended up starting with the new project template in PyCharm, and just adding an app.py file and __init__.py file. I copied over what was useful from my prior project, and augmented that with various examples I found via Google. It’s similar to the approach I took when creating Node web services for personal projects. I also wanted to find a good canonical example of scaffolding but I never did. I just ended up starting small and expanding from there. The problem is these tech stacks seem to change so often, and there are so many options, it would be hard to maintain a solid example over time. When I needed to spin up new Node apps I simply copied what I had done previously and modified the project as I added features or learned new Node best practices.
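
The starting point really was tiny; a minimal sketch of that kind of bare-bones app.py (the /health route here is just an illustrative placeholder, not a real endpoint):

# app.py -- the simplest possible flask entry point
from flask import Flask

app = Flask(__name__)

@app.get('/health')
def health():
    # placeholder route to confirm the service responds
    return {'status': 'ok'}

if __name__ == '__main__':
    app.run(debug=True)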

I eventually got a bare bones Python service running properly on my machine. I’m sure I’ll be adding to the project and re-arranging lots of things as I go.

This is part of a series of posts on learning Python and building a Python web service.

Environment

Python has a similarity to Node.js: there are different versions of the language/runtime and not all apps will work with every release. And sometimes you need to develop against a different version, but it’s not possible to have multiple versions active simultaneously. For Node there are tools like nvm or nvm-windows that solve this problem, allowing you to switch which version of Node is active at any time.

Python devs run into this a lot. Libraries or tools may not work with a new version of the language. Or you have to maintain an app that only works with an older version, but you also need to write new code using the current one. The suggested way to handle this is to create a virtual environment in each project directory. Each environment includes a directory that contains the version of the Python interpreter you want, plus the dependencies for just that project, plus assorted helper scripts/tools. A different project in a different directory can have its own virtual environment with all different stuff. There is also a tool for Windows named pyenv-win that helps manage having different versions of Python installed on the same machine, though I decided not to use it for now since I didn’t need it.

So that all made sense and I planned to create one for the Music Browser API. But again, that led to the question of which tool to use to create one (another Node similarity: no shortage of tools available to do any particular thing). Python itself includes a module to create these environments named venv. It does the job, but some brief web searching suggested other, better choices. I eventually picked virtualenv since it had good features and made things easy. It works great, though I ended up building, destroying, and re-building the environment for the Music Browser API over and over, as various things I was trying went wrong (more on that here).
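
For the record, creating and activating an environment is only a couple of commands, whichever tool you use (the .venv directory name is just a common convention):

# create the environment with the built-in venv module...
python -m venv .venv

# ...or with the virtualenv package
virtualenv .venv

# activate it on Windows
.venv\Scripts\activate

# activate it on macOS/Linux
source .venv/bin/activate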

Postscript:
An interesting thing I learned later is that virtualenv comes bundled with PyCharm, which is the IDE I’m using. When you create a new Python project, PyCharm uses it to create a virtual environment for you. So running virtualenv by yourself isn’t even necessary, though it’s good to know how to do it.

I recently decided to expand my language skills and learn Python properly. I had dabbled in it at a previous job, but that was long ago and very little of that knowledge has survived in my head. I figured an online course of some sort was the best way to go. I bought a Python boot camp to learn the basics of the language, but it was going slower than I wanted. It’s geared for absolute beginners and I now realize I should have tried to find one suited for experienced developers. In any case, I decided to start setting up an actual working codebase for what I want to write: the new back end web service for the Music Browser.

Which brings us to this post: the first in a series of things I’ve learned about the language, the Python toolchain, how stuff works, and how to do things the right way. Obviously for experienced Python devs none of this is new. But I figured having a reference to read later would be useful. I plan to add more entries as I go. Here’s the first topic:

Choice of Framework

My goal was to create a RESTful web service, so my first question was how that is done in modern Python. There are currently two main types of framework to use: WSGI-based or ASGI-based. Their job is to be a bridge between a Python service and the underlying HTTP server. WSGI is the legacy standard that has been around for a while; its biggest drawback is that it responds to HTTP requests synchronously. ASGI is the modern version that does the same thing and is backwards-compatible, but can respond to requests asynchronously, providing better performance. It also includes other features that are useful in building full-featured services.
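
To make the “bridge” idea concrete, WSGI at its simplest is just a callable that the HTTP server invokes for every request; a minimal, framework-free example:

# The server passes in the request details (environ) and a callback for starting the response
def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello from a bare WSGI app']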

The most popular WSGI framework is flask. It’s battle-tested, has a lot of documentation and examples, and is simple to set up and use. For an ASGI framework there are several choices, such as FastAPI. I went back and forth over which I should use. Part of me wanted to pick flask since it’s easier to configure out of the box and I could spin up something quickly. But another part of me wanted to go with the current thing, since I should try to learn the future and not the past. Plus the idea of having async support by default was appealing.

In the end flask won out. I wanted to get some working code going ASAP, and I was worried I’d spend a ton of time learning the ins and outs of something complex like FastAPI before being able to get an app off the ground. All the docs I read said flask was dead easy to get started with.

After that decision came a related question: what form of flask should I use? Turns out there is no-frills flask, but also other libraries that are extensions to it or drop-in replacements for it, which offer a lot of niceties that make writing APIs easier. I looked at a couple and decided on APIFlask. I had a working flask setup with two basic endpoints already, and impulsively started re-factoring things after browsing the APIFlask web site for like 5 minutes (note to self: a bit more deliberation on these kinds of things is worth it. You never know how far you might get into the new shiny thing before realizing it’s just not going to work for you, and I had a brief scare about not being able to figure out how to do a particular thing I wanted to do).

AI is everywhere, sadly. It’s really an indictment of the computer technology space that as soon as cryptocurrency was effectively exposed for the fraud and ridiculous concept that it is, the capital overlords decided to pivot to generative AI. Which is equally ridiculous in the sense that nobody asked for it, it’s not something that solves any widespread problem we as a society have, and it has numerous shortcomings (and it’s not even remotely close to intelligence). Yet it’s causing corporations to throw billions of dollars as fast as humanly possible into developing the technology for mass use.

Things came to a head for me personally in recent days, when Google rolled out their AI Overview feature in search. And it’s lame. And it’s not something you can turn off! But they did do a nice thing and gave us… udm=14.

I’ve been using the web for a really long time, and I remember when Google search first appeared and how magical (no joke) it was to use. Over time that experience has changed and not always for the better. So when Google decided it was search’s time to get the AI injection, it made me very sad, especially since there was no way to disable it. Thankfully they allowed users to go back to a simpler time when search was for finding web sites and not for trying to answer complex questions using data of dubious accuracy. Just by adding a query parameter to any search URL. And it’s amazing. It has been so long since search results looked like this that I had forgotten how nice it was.

Nothing but links! No cruft, no nonsense, no AI garbage in sight. It’s glorious. Hopefully that query parameter will be around for a long, long time. It’s so popular that searching for ‘udm=14’ brings up a ton of articles about it. The best place to learn how to use it is the appropriately-named udm14.com

I recently came across this interesting article from Cloudflare about how its DNSSEC public keys are signed by its private key, and the ceremony that is undertaken to do that every few months to secure its root DNS zone. It’s fascinating to me because few people know what DNS is and how it is vital to the Internet as we know it.

The process is straight out of the first Mission Impossible movie. But it’s heartening to know that so much effort goes into something that computers all over the globe trust every day for their operation.

I’m a huge fan of the Material-UI component library for React. I’ve used it in professional and personal projects for many years now, and it’s super sweet. Why is it so great?

Two things:

  • A wide variety of widgets that cover just about any UI you would want in a web site
  • Excellent documentation

There is also a large community of users so lots of good Stack Overflow questions and answers exist. Occasionally I’ve had to try and make their components do a new bit of magic, and it can be tough to wade through all that knowledge to determine if a specific use case will work. Recent case in point: figuring out how to replicate features of the Gmail To field.

My first step was considering a chip input library for MUI that I had found earlier, which would add chips to a standard MUI TextField component. But it appeared that library was not really being maintained any more, and under the hood it used the pre-hooks React lifecycle. So maybe not the best choice for new feature work. But I lucked out: there was an open issue in that repo that linked to an example that basically did what I wanted and, bonus, it was in TypeScript.

This code sample used the MUI Autocomplete component, which is a solid way to get input from a list of preset options that supports type-ahead. Or it can be completely free-form where you can type anything you want in addition to a preset value. You can also configure it to allow inputting multiple values and then render special markup for those entries, as well as rendering whatever input component you want. All well and good. I was going to use it as the CC field on a page for sending email, and I had certain requirements:

  • Make sure each entry is a properly-formatted email address
  • Make sure the domain wasn’t in a blocklist
  • Do these validations when the user hits the Space key after typing an address

The first two were easy: we already had a regex to check the format of an address that worked for our purposes, plus code elsewhere to check if an address domain was in our blocklist. The tricky part was the Space key. The Autocomplete component has two props for its value: value and inputValue, which are tracked separately. In my configuration the former will have the list of email addresses, and the latter will have the value of the <input> element (in my implementation the list of addresses was actually a list of objects, each with the address and a validity flag).

To capture when the user inputs something, I handle the onChange event. The trigger for it is hitting the Enter key, which is good since I want users to be able to press that key to check the address they just typed. In order to capture any other keystrokes, I needed to handle the onKeyDown event for the underlying TextField. The tricky part was telling it to use my custom handler.

The example I was following had a function for the renderInput prop. That function returned a TextField for use as the input portion, spreading whatever props were passed to renderInput onto the TextField instance. It wasn’t readily apparent that any of those props were actually being set, so instead I spread my own object that included my key down event handler. And suddenly it became very broken, for a good reason: there were props that Autocomplete needed to assign to the TextField and I was completely leaving them out.

The solution was to add my handler to the incoming object holding the required props, and assign that to the TextField, like so:

renderInput={(params) => {
    // Copy the props Autocomplete passes in, then add our own handler on top
    const newParams = {...params};

    newParams.onKeyDown = (event: React.KeyboardEvent<HTMLInputElement>) => {
        // Check for the Space key and handle it as needed
    };

    return (
        <TextField
            {...newParams}
            label="CC"
            placeholder="Enter email address"
        />
    );
}}

After that the regular Autocomplete features all worked and my Space key handler functioned as expected.

It was a bit over 10 years ago that I first wrote about my efforts to create a better version of a long-dead native iOS app that Mozilla wrote to dip their toes into the iOS pool. All their app did was give you access to certain data in your Sync account: bookmarks, history, open tabs, etc. It was perfect for me because I had an iPhone and lots of Firefox bookmarks, having used the browser since 2006.

My desire was to write a web app that worked the same but avoided a really dumb bug in their app, and also learn some new technologies while I looked for a new gig (I had just moved to Seattle, WA). I reached into my depths of creativity and called it the Bookmark Browser. It was a tremendous success. I still use the app today, over 10 years later. But it’s changed a lot in that time, evolving to use newer tools and work with changes to Firefox services. Follow along as I catch up on what’s changed since the last revisit.

Sidebar: I want to first tip my hat to Valérian Galliat who in mid-2021 did a deep dive on his efforts to do the same thing the Bookmark Browser does, which is access data in Firefox Sync from an application that’s not Firefox. I wasn’t able to take advantage of the code he wrote, but it’s a detailed and entertaining journey into what it takes to build a third-party app on top of the Firefox ecosystem (hint: far, far too much). It also inspired me to re-visit my original back end code and see if I could get it working again.

The original 2012 Bookmark Browser implementation was a front end built on jQuery Mobile + KnockoutJS with a back end written in C#. That back end leveraged a client library written to do the heavy cryptographic lifting of getting data out of Sync. Life was grand until 2016, which is when authenticating against Firefox services started becoming cumbersome. Previously my code could log into my Sync account and pull down data immediately, with no other interaction needed. Mozilla was in the process of improving their services and hardening things around the login API. I ended up having to code around those restrictions (e.g. needing a second step in the process to verify a login). The good news was it still worked.

Also in 2016 I started using AngularJS in my day job and really liked it. In early 2018 I decided to convert the front end to that framework. It was also around this time that the Sync access just stopped working. I’d get an ‘Unauthorized’ error when trying to call their login API and couldn’t figure out a way past it. So I switched the back end to accept an upload of a JSON bookmark backup made from the desktop version of Firefox, and added an endpoint to fetch that data. It was annoying to have to go through the extra step of uploading a backup whenever I wanted to refresh the bookmark data stored on my phone. But it was worth it to have the app work consistently.

In December of 2020 I decided another tech stack update was in order, because AngularJS was nearly dead and I wanted to switch hosting providers. I re-wrote the front end in React, which I had been using for over a year and a half in my day job. The back end was the same C# web service that accepted a bookmark backup file, using the latest 4.x version of .NET. I still prioritized certainty of operation over the convenience of being able to get data directly from Sync. The fun part of this upgrade was figuring out how to have front end components communicate with each other, which was very easy in AngularJS. I decided to use the now-much-improved Context API and the various hooks React now offers. I replicated all functionality and resolved some nagging CSS issues along the way. It was a great improvement. I then moved everything to Azure for hosting in a Windows VM, and Azure DevOps for CI/CD (the VM also hosted this blog and therefore ran the WordPress stack).

The app worked great but as time went on I really, really wanted to figure out how to talk to Sync directly again. Then in early 2022 I found Valérian’s blog series on doing exactly what I wanted to do, purely by chance after a random Google search. As luck would have it, at that time I was considering porting the back end to Node.js, since I wanted to ditch my Azure VM and set everything up in an App Service based on Linux. And he had some JavaScript code! Sadly, when I tried to use it I ran into an error I couldn’t explain while trying to authenticate. But he said he had gotten it to work, which suggested it was possible. I dusted off my original C# web service, updated it to a .NET Core 6 project, and compared that code to the JavaScript Valérian had written to authenticate and access the encrypted contents of Sync storage. I really only found one important difference: a reason identifier for the initial login request.

The first step in the crypto-dance to get data out of Sync using Mozilla’s original BrowserID protocol is to make a login request using your email address, Firefox Accounts password, a verification method, and a reason for the request. In my login API that request boils down to the following, using some helper code for the mechanics of making an outbound request and a class for this particular request:

public class LoginRequest
{
    public LoginRequest(Credentials credentials)
    {
        this.Email = credentials.Email;
        this.AuthPW = BinaryHelper.ToHexString(credentials.AuthPW);
        this.VerificationMethod = "email";
        this.Reason = credentials.Reason;
    }

    [DataMember(Name="email")]
    public string Email { get; private set; }

    [DataMember(Name="authPW")]
    public string AuthPW { get; private set; }

    [DataMember(Name="verificationMethod")]
    public string VerificationMethod { get; private set; }

    [DataMember(Name="reason")]
    public string Reason { get; private set; }
}

Post<LoginRequest, LoginResponse>("account/login" + (keys ? "?keys=true" : ""), loginRequest);

In the original Sync client I used, the LoginRequest class only had the email address and password. And I think what happened is at some point the other two values became required, and Mozilla didn’t really go out of its way to tell third-party developers who maintained apps that authenticated to Sync via code. But Valérian noticed it and his implementation passes in all those pieces of data. So I simply added those two properties to the class and it worked. Sort of. I had to go through some trial and error because apparently there are a couple different verification methods you can use. The reason value always needs to be ‘login’, which avoids any immediate need for verification. The verificationMethod value can be one of these supported values:

  • email
    • Sends an email with a confirmation link.
  • email-2fa
    • Sends an email with a confirmation code.
  • email-captcha
    • Sends an email with an unblock code.
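
Putting those pieces together, the serialized login request body ends up looking roughly like this (the email address and hex string are placeholders):

{
  "email": "someone@example.com",
  "authPW": "a1b2c3d4...",
  "verificationMethod": "email",
  "reason": "login"
}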

I tried email-2fa and email-captcha but couldn’t get either one to work. I don’t know why, but in the end it didn’t matter. The process now has three steps: step one is making a login request; step two is clicking a link in the email you receive to verify the login attempt, which basically establishes a valid login session with the Firefox Accounts services; and step three is getting the bookmark data. The legacy BrowserID protocol that all of this code uses basically requires that a new set of keys be generated each time you want to make requests to Sync storage, because they are very short-lived. So on step three I have to make the same login request, but this time it passes with flying colors and I simply get an email warning that a successful login was made with my credentials, make sure it was you, etc.


As of today, Mozilla still supports their legacy BrowserID protocol and infrastructure for authenticating an account and using their various Sync services. This is very nice of them. I’m under no illusion that any of it is permanent; it could all disappear tomorrow. But I don’t think that’s likely, since my guess is that part of Firefox or other Mozilla apps/tools still use that protocol, and so they will support it for a while still. As long as they don’t change it, I’ll be able to get bookmark data out of my account and into Bookmark Browser every time. And even if they did finally pull the plug on the BrowserID stuff, I left in the bookmark backup code so I can fall back to that.

Postscript:
Mozilla does have a mature OAuth implementation that they put together a while ago, which is obviously a better way to authenticate. But it assumes you have a valid client ID for the initial auth challenge, and that they know the URL to redirect users to once they’ve obtained a token. Apparently they want you to ‘consult’ with them to set it all up, which sounds rather unappealing. As Valérian mentioned and I agree with, Mozilla doesn’t seem to want to make it easy for developers to build apps on top of the Sync ecosystem. Which is sad because a lot of cool things could probably be built.

Update:
Turns out ‘a while still’ means up to May of 2024: Mozilla officially started decommissioning an essential piece of the BrowserID protocol. I found this out by trying to update the bookmarks on my phone and getting a strange 404. The endpoint that does cryptographic certificate signing as part of the authentication process was slowly being restricted. It’s now basically gone and so I’m left with my fallback of uploading a Firefox bookmark backup JSON file. When I first learned what was happening I actually considered reaching out to them to see about getting an OAuth setup in place for my app. I haven’t yet but it might be worth a shot. The worst they could do is say No.

Not that Create React App was ever bad. But when I started working with React back in 2019 I was not a fan of the idea that all the plumbing for your app would be completely hidden from view. I was coming from working with a production AngularJS front end, where I knew every inch of our Gulp build script, all the build configuration, and all the dependencies we used. It was essential to have this knowledge in order to enhance the build process or add new libraries in such a way as to not break anything.

But CRA takes a very different approach. You see none of that complexity. On the one hand it’s a relief since modern front end development is a minefield of OSS libraries, ever-changing language standards, and interdependent code that when combined is really hard to understand. Not having to worry about that has enormous mental benefits. On the other hand, CRA chooses a specific set of things to support. If you want to use something different, which is one of the primary selling points of the React universe, you are out of luck. You can break free of the CRA womb and use whatever tools you like via ejection, but as the wise man is oft quoted, ‘Stuff can break’.

But I didn’t care about that. I reasoned that having maximum flexibility is good, and if you are maintaining an app using a bunch of different tools you should know how those tools work. That way you can fix the engine when the car breaks down on the side of the road. And so with this sentiment I ejected all the CRA-based apps I created, both at my job and those that were personal projects. Part of me was worried about how to deal with maintaining the dependencies, but others had gone the same route and they seemed to have ways of handling it.

I recently had a change of heart, and it started when CRA 5 was released. It included improvements to fast refresh, the latest version of Webpack, unpinned dependencies, etc. All good stuff. Around this time I had wanted to update a handful of personal React projects to the most current libraries. They were all ejected, and the sheer number of dependencies to potentially have to update was more than a little daunting. Plus I wanted it done fast (and was lazy). So I thought about the main reasons I had ejected in the first place:

  • I had wanted to use LESS, not SASS
  • I did feel like knowing exactly what went into my build process was beneficial

I first started using LESS in 2016 and was happy with it, but in recent years I’ve had to use SASS at my job. And I’ve decided they aren’t really different enough for me to worry about. So reason number 1 doesn’t really apply any more. As for reason number 2, I finally realized what the CRA team was really getting at: they offer a curated set of libraries that you can assume are going to work fine together, and because there are so many nowadays, not having to deal with them all is clearly the better option. There will always be library updates, and having a known good set is a nice way of making sure you don’t get sucked into something that blows up your app.

So to do my personal upgrades I simply created new CRA projects in separate directories, copied over my source, made any necessary adjustments so they could build, then copied all that back over into the original repos. It was painless and allowed me to take advantage of the new hotness.

Update:
I didn’t learn this until summer of 2024, but it appears CRA essentially became deprecated in 2023. It is no longer being officially maintained in its current form, and is generally not recommended for new React-based projects. There are other options around so I’ll have to eventually pick one and switch to it.