I've been spending some time recently to get on top of the dependabot alerts in our repo that have fallen by the wayside a little.
Many of these are often a quick look over the changelog and approve, assuming the tests all pass; there might be some QA involved if anything is particularly spooky-looking or load-bearing. Then some upgrades need a bit of a closer look because the PR bumps several versions, and there are some breaking changes in the API, or the behaviour is now different, and things break in the app, not usually a massive drama.
Before we get into it, I want to give a special thanks to Iain McCoy for doing some of the spelunking on this issue and picked up what was happening in the C side of things.
A new mystery
Something more interesting than I ran into recently came up while upgrading the version of fhir_client
that we're using, from 4.0.3 to 5.0.3. Most of the changes looked relatively innocuous, except our test suite was now failing in a whole bunch of places with the following error:
NoMethodError:undefined method `has_key?' for nil:NilClass
There was no specific place that was causing issues; it seemed to me whenever we were making a request via the client we had set up to hit an external API. Taking a quick trip to the issues for the project turned up an issue that looked suspiciously similar but with no actual outcome at the end of the thread barring some workarounds.
So it seems like I was on my own here. Let's take a look at the full stack trace from the failures:
NoMethodError:undefined method `has_key?' for nil:NilClass# /usr/local/bundle/gems/fhir_models-4.2.1/lib/fhir_models/bootstrap/json.rb:10:in `pretty_generate'# /usr/local/bundle/gems/fhir_models-4.2.1/lib/fhir_models/bootstrap/json.rb:10:in `to_json'# /usr/local/bundle/gems/fhir_client-5.0.3/lib/fhir_client/client.rb:397:in `request_payload'# /usr/local/bundle/gems/fhir_client-5.0.3/lib/fhir_client/client.rb:527:in `post'# /usr/local/bundle/gems/fhir_client-5.0.3/lib/fhir_client/sections/crud.rb:207:in `base_create'# /usr/local/bundle/gems/fhir_client-5.0.3/lib/fhir_client/sections/crud.rb:177:in `create'
Cracking open the fhir_models
gem with code $(bundle show fhir_models)
on the update branch to get a look at the changes, I can see that the module having issues is pretty simple:
require 'JSON'module FHIRmodule JSON...def to_json(opts = nil)JSON.pretty_generate(to_hash, opts)end...end
I could confirm with pry that trying to generate the JSON with to_hash
and opts
(which was nil) was raising that error, and via rails console
that trying to pass other random hashes of data to JSON.pretty_generate
would also throw the same error when opts
was nil, but would be fine if opts
was an empty map.
The mystery arose when looking at the source for JSON.pretty_generate
, where the definition of that method uses nil as the default value of opts
, and the handling should very much take care of a nil value.
Things became a little clearer when trying to do the same test via irb
:
$ irbirb(main):001:0> require 'json'=> trueirb(main):002:0> JSON.pretty_generate({a: 'b'}, nil)=> "{\n \"a\": \"b\"\n}"
No error. I began to wonder if JSON was even the same thing here. My next lead was that there was something off in our app.
A case of mistaken identity
I knew that our app was using Oj for JSON serialisation. But I didn't initially notice anything spooky about pretty_generate
. A colleague of mine (shoutout Iain) had keener eyes and pulled up the C file that dealt with something spooky sounding in mimic_json.c.
Reading up a bit more about the Rails setup for the Oj gem, the documentation suggests adding the following to an initialiser in your app to have the gem "take over" (quotes theirs) many methods of the JSON gem. That info, partnered with the C file, illuminated us to the fact that the OJ.mimic_JSON
method redefines the JSON.pretty_generate
method (among others) with a different implementation, one that either doesn't handle a nil value for opts or changes the handling to be focused on an empty hash.
Taking a look back at our initialisers, I found one for Oj, one where we're calling Oj.optimize_rails
, of which mimic_JSON
is one of the steps.
Moving forward from a drop-in replacement
We find ourselves here because we broadly do a lot of JSON generation as part of our API. We use active_model_serializers
, and the easiest way of getting the benefits of Oj's speedier JSON generation is to let it take over everything in one step making all of your serialisers more performant.
On the other hand, the problem is that you're now rolling with all of those redefined methods any time you use the JSON module, even when you're not a) aware of it or b) intending to use them.
In a happy coincidence, we're knee-deep in the middle of a migration of our serialisers to use Blueprinter instead of AMS. One benefit, in this case, is that it lets you explicitly set what generator you're using, like so:
require 'oj'Blueprinter.configure do |config|config.generator = Oj # default is JSONend
This whole scenario has fast-tracked our plans a little to finish off our migration so that we can remove gotchas like this and unblock our upgrade of the fhir_client
gem that we were looking for at initially.
I always enjoy a bit of a spelunk like this. I'd be keen to hear of any other interesting scenarios that you might have run into that are similar.
Haha think I wrote this as a form of self-care after the fact, but, I found this to be a fun little off-road adventure.
— Anton Ivanopoulos (@aivanopoulos)May 3, 2022
One of our dependency updates was causing some strange JSON-related test failures in our CI, in this post, I follow the thread on why.https://t.co/EmaR78zQmc