08 May 2014
git SHAmend! a quick way to amend changes into an older commit
When I work on projects with slow test suites, my workflow often ends up looking sort of like this: I make some changes on a branch, run the tests that seem likely to be relevant locally, and then push the branch off to tddium (or whatever) to see if any of the other tests fail unexpectedly.
I like each of my commits to be clean and green in isolation, to make digging through project history easier in the future. To that end, when I make a bunch of commits before running all my tests, I often end up fixing bugs and using interactive rebase to merge the fixes into earlier commits before merging my branch in. (So long as I’m the only person working on that branch, at least!)
It’s a reasonable enough process - stage my changes for the fix, commit them as a WIP, then open up interactive rebase to amend them as a fixup to the right commit from earlier. That’s so many steps, though! And I’m so lazy! I really just want to be able to run something like git shamend SHA_FOR_EARLIER_COMMIT instead. (Or better yet, git-smend or git-sm for short!)
So, I wrote git-shamend to solve this problem for me. The full script is available here - just copy it to /usr/local/bin (or wherever you prefer to keep such things) and you’ll be able to use it with git shamend SHA_TO_AMEND.
I initially planned on writing SHAmend! using git’s low-level (“plumbing”) commands, but all my twitter buddies told me I shouldn’t feel guilty about building on top of porcelain (git’s high-level commands) instead if I wanted to.
The meat of how it works is like this:
First, we get the SHA for the reference you pass in. This avoids problems that could come up if you pass in something like HEAD^, whose meaning changes whenever you add a new commit (as we do later on).
If the reference you pass in is a commit that’s in your current branch…
…then git-shamend commits your staged changes, marked as a fixup (which is an amendment that retains the original commit message) to that earlier commit…
…and if you have any remaining unstaged changes…
…stashes them so they don’t interfere with the upcoming rebase…
…and runs an interactive rebase automatically to get that fixup amended properly to the earlier commit you specified.
Wait, it runs an interactive rebase automatically? That sounds kinda weird. It’s interactive, but it’s not!
Git uses the environment variable $GIT_EDITOR to figure out which editor to open up to allow you to move commits around when running an interactive rebase. Setting that to “true” here causes git to use true as your editor for this command, where true is just a tiny unix program that does nothing except exit with a successful exit code.
So from git’s perspective, it runs an interactive rebase and opens up an editor for you to move around commits, which ‘you’ close successfully. Great, that’s all git-rebase needs from, er, ‘you’ - assuming there are no conflicts, it can handle the rest on its own!
This works because the --autosquash flag tells git-rebase to put fixup commits in the right place for you before opening up the editor, so there’s really nothing you need to do to get things sorted out right.
From your perspective, git just kinda does its thing without bothering you. Gotta love that.
But what if something does go wrong? If the rebase exits unsuccessfully (that is, with a non-zero exit code)…
…then git-shamend aborts the rebase and resets that fixup commit it created earlier, to clean up after itself. (And echoes a warning, natch. All that stuff is in the actual git-shamend script.)
And at the very end, if you did have any unstaged changes that were stashed earlier…
…they’re popped from the stash, to return your working directory to its pre-SHAmend!ing state.
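For anyone who'd rather read the whole flow in one place, here's a rough Python sketch of the steps above. To be clear, this is not the actual git-shamend script (that one is a shell script); the function names `git`, `resolve_sha`, and `shamend` are made up for illustration, and using `git reset --soft HEAD^` for the cleanup step is my assumption about what "resets that fixup commit" could look like.

```python
import os
import subprocess


def git(args, env=None):
    """Run one git command and return its exit code."""
    return subprocess.run(["git", *args], env=env).returncode


def resolve_sha(ref):
    """Pin a moving reference such as HEAD^ to a fixed SHA (None if invalid)."""
    result = subprocess.run(
        ["git", "rev-parse", "--verify", "--quiet", ref + "^{commit}"],
        capture_output=True, text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None


def shamend(ref, run_git=git, resolve=resolve_sha):
    """Fold the staged changes into an earlier commit; True on success."""
    sha = resolve(ref)
    # Only proceed if the commit is an ancestor of the current branch tip.
    if sha is None or run_git(["merge-base", "--is-ancestor", sha, "HEAD"]) != 0:
        return False
    # Commit the staged changes as a fixup of the target commit.
    if run_git(["commit", "--fixup=" + sha]) != 0:
        return False
    # `git diff --quiet` exits non-zero when unstaged changes remain.
    stashed = run_git(["diff", "--quiet"]) != 0
    if stashed:
        run_git(["stash"])
    # GIT_EDITOR=true "closes the editor" immediately, so nobody is prompted.
    env = dict(os.environ, GIT_EDITOR="true")
    ok = run_git(["rebase", "--interactive", "--autosquash", sha + "^"], env=env) == 0
    if not ok:
        # Assumed cleanup: abort the rebase, then undo the fixup commit
        # while keeping its changes staged.
        run_git(["rebase", "--abort"])
        run_git(["reset", "--soft", "HEAD^"])
    if stashed:
        run_git(["stash", "pop"])
    return ok
```

Calling `shamend("HEAD~3")` inside a repository would run the real commands. Passing in your own `run_git` and `resolve` callables lets you dry-run the sequence without touching a repo. Note that rebasing onto `sha + "^"` won't work if the target is the root commit.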
Now I can smend smend smend as much as I want, way more efficiently!
02 May 2014
Finding words that sound alike but are spelled wildly differently
I’ve been working on search stuff lately, and we needed some wordlists to help test search results that match only because they sound similar to the query, and not because they’re spelled similarly.
Turns out we couldn’t find a pre-existing wordlist of homophones (words that sound the same but are spelled differently) that are dramatically different in spelling. And our QA team especially wanted some examples of people’s names that meet those criteria.
So, sure, I figured that’d be fun and quick to throw together for them!
It’s a lot like finding anagrams - the basic structure was a dict (a hash map, for the non-Python folks reading this) keyed by the phonetic encoding of each word. Each key pointed to a nested dict, which included an array of words which phonetically matched the key and a bool indicating whether it fit my criteria or not. In the end, all matching words were spit into stdout as a list of comma-separated homophones.
I determined whether words were spelled differently enough by checking whether a small enough percentage of their trigrams were the same. (I also had a minimum length set, so I’d be sure to have enough trigrams per word to be worth checking for match percentage.)
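Here's a minimal sketch of that structure, not my actual script. The names `trigram_overlap` and `dissimilar_homophones`, the thresholds, and the choice to divide by the smaller trigram set are all assumptions for illustration. The phonetic encoder is passed in as `encode`, and the example uses hand-assigned (hypothetical) keys rather than a real metaphone implementation.

```python
from collections import defaultdict


def trigrams(word):
    """Set of three-letter windows in a word, case-insensitive."""
    w = word.lower()
    return {w[i:i + 3] for i in range(len(w) - 2)}


def trigram_overlap(a, b):
    """Fraction of shared trigrams, relative to the smaller trigram set."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 1.0  # too short to judge, so treat as "not different enough"
    return len(ta & tb) / min(len(ta), len(tb))


def dissimilar_homophones(words, encode, max_overlap=0.2, min_length=5):
    """Group words by phonetic key; keep groups with a very different pair."""
    groups = defaultdict(lambda: {"words": [], "keep": False})
    for word in words:
        if len(word) < min_length:
            continue  # not enough trigrams to be worth comparing
        entry = groups[encode(word)]
        if any(trigram_overlap(word, other) <= max_overlap
               for other in entry["words"]):
            entry["keep"] = True
        entry["words"].append(word)
    return [", ".join(e["words"]) for e in groups.values() if e["keep"]]


# Hypothetical hand-assigned phonetic keys, standing in for a real encoder.
keys = {"Geoffrey": "JFR", "Jeffery": "JFR", "Stephen": "STFN", "Steven": "STFN"}
print(dissimilar_homophones(keys, keys.get))  # prints ['Geoffrey, Jeffery']
```

Stephen and Steven share the trigram "ste" (an overlap of 0.25), so they don't make the cut at this threshold, while Geoffrey and Jeffery share no trigrams at all.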
(It was kinda neat to find something that felt more like an interview puzzle than anything else, but was actually useful for my day job. Oh hey, look, those skills are occasionally actually useful! Now you don’t have to feel weird about all the time you spent learning how to solve these sorts of puzzles!)
Sweet and simple and fun! Here’s my script and a few of the wordlists I created with it, since I figure other people may also find this sort of thing useful when testing search implementations. (FYI, if you’re using something other than a metaphone/doublemetaphone soundalike algorithm and trigrams for misspellings, you may want to make some adjustments.)
24 Apr 2014
My new favorite vim/tmux bug
This week, I'm grateful that my coworkers know to come grab me if something seriously weird is going on, because it fills me with so much glee! I mean, WHAT.
minimal repro:
On Suffolk (one of our machines), open tmux, open vim, open new terminal tab.
Vim gets “lililililililill” inserted in current file, and beeps a lot
If the file already has content, it prepends i and appends ll to ~10 lines, and sometimes capitalizes something
WTF WTF WTF THANK YOU
I'm going to skim over some of the details so that this remains a blog post and not an endless excited ramble, but! This is approximately how figuring this nonsense out went!
Initial poking around
When does the problem happen? When you open a new bash tab or window, or enter any command in any bash session.
"lililililililill" looks very suspicious. Is that a macro or something hiding in one of the vim registers? Use :reg to check the contents of the vim registers - nope, nothing fishy in there!
Is there anything funky in our tmux config? ~/.tmux.conf doesn't exist, and a quick googling around didn't turn up anything on any other standard sorts of tmux config files. Fair enough, put that aside for the moment.
Poked around a bit more to define the edges of the problem:
- Happens both in iterm and terminal
- Happens after restarting vim, restarting tmux, reinstalling tmux, reimaging Suffolk
- Does not happen in vim/macvim outside of tmux.
- Does not happen in any other of the handful of machines that were checked.
- Does not happen in vim in tmux when ssh'd into another machine.
(One exception to that last one - a coworker said he was able to replicate it when ssh'd into a remote coworker's machine. But when we tried to replicate that, it didn't happen. An isolated datum, potentially relevant, but highly suspect. To this day, I'm pretty convinced that folks got mixed up and it never really happened in the first place - happens to the best of us, and it doesn't fit with any of the other evidence.)
Cool, we got the lay of the land. So! What changed recently?
Ah, this machine was newly reimaged. Maybe we have new broken or incompatible versions of some things?
We were told that the tmux version should be frozen as part of our install script. Do you believe everything you're told? No? Good! You guessed it, we had totally different versions of vim, tmux, bash, and OS X on this machine than on other machines which do not exhibit the same problem.
Around this time I started up a Google doc to keep track of everything we were trying, because once things start to look complicated I know I won't be able to remember everything I've tried. Especially when multiple people are involved! And it's a horrible waste of time to repeat experiments out of forgetfulness, or even worse, lose potentially relevant data. I won't bore you with a full list of versions and reinstallation steps, but boy do I have all the details in my notes.
Point being, we downgraded bash, tmux, and vim to match the versions working on other machines, but the problem remained.
At this point, I was sadly told that the machine was just going to get a bunch of stuff reinstalled and I shouldn't spend any more time poking at it. Sadness! But okay, fair enough, it was getting in people's way and the show must go on.
But wait! Things don't magically solve themselves after all!
Imagine my delight when I came in the next morning and heard that the reinstalling stuff hadn't fixed the problem! I'd been super bummed the day before to have my mystery stolen away from me, so this was very exciting! I hung out with a coworker for a bit to give him pointers on how to look into Elasticsearch bugs, then ran off to the biggest mystery of the week.
Aha! We noticed that we have a tmux-related vim plugin in our vim config - tmux-config. Bonus points for anyone who feels like stopping here to look at that and guess how this story ends. ^___^
I didn't have much time to play with it in the moment, but the very best thing happened - we were able to replicate it on any machine recently reimaged with our new workstation setup script! This meant I was able to get the bug onto my laptop! AW YEAH.
Wheeeeee I got to take my bug home with me to play with!
I sat down to take a closer look at that tmux-config plugin.
~/.vim/bundle/tmux-config/tmux-autowrite/autowrite-vim.sh creates a preexec function that’s called whenever you start up a new bash session, or enter any command in bash.
Line 31 reads:
Commenting out that line causes the problem to go away. Running tmux send-keys -t %0 ^\\ ^n F19 WriteAll manually in another bash window causes the bug to manifest regardless. Perfect! What is this thing trying to do, and what is it actually doing?
Ah! In that same plugin, ~/.vim/bundle/tmux-config/plugin/autowrite.vim:45 defines this relevant mapping:
My notes from the moment my jaw dropped after one look at that:
I’m not sure what ^\\ and ^n are supposed to do - they don’t seem to be doing anything.
The rest is a mapping set in autowrite.vim:45 to save vim buffers when you do other stuff in the terminal, basically trying to mimic the way we have macvim set up to save on blur.
I’m not sure why yet, but F19 is what toggles capitalization and makes the damn beep. It only does the capitalization in vim inside tmux, not in vim outside tmux.
And then 'WriteAll' is interpreted as a normal vim command -
- ri replaces the character under the cursor with an i
- e takes you to the end of a word
- A takes you to the end of the line and puts you into insert mode
- then ll is inserted at the end of the line
Not sure why the preexec function gets run multiple times with each bash session/command, but that's what must be happening!
WOW. BUT WHY?!???
This is around the time I settled in to perform a series of experiments.
What happens if I send F19 alone with tmux send-keys?
Input:
Output:
BEEP BEEP BEEP &c, and if the file open in vim has any contents, the next <= 3 characters get their case toggled.
Huh, case gets toggled. Not just capitalized. Toggled. That's interesting.
What happens if I hit F19 in vim outside of tmux?
Aw, hell, my MacBook doesn't even have an F19 key! Yeargh. Fine, whatever, I went and installed KeyRemap4MacBook so I could remap fn-fn to F19 to test stuff with.
Result: No beeping or case toggling.
Why does F19 cause beeping/case toggling in vim inside tmux but not in vim outside tmux?
Am I super confident that my mapping worked properly? I mean, I tested it with EventViewer, but how realistic is that? Does tmux send-keys somehow send something different than what my mapping thinks I'm sending now?
How else can I test that F19 is what it claims to be?
I did some googling around, and learned that you can actually check how keystrokes are encoded in bash by opening up your terminal, hitting control-v, then hitting a key.
Whoa, neat, that seems useful! I checked encodings to see if I could find a difference, and oho, that jumped out at me!
- Inside tmux, F19 is encoded as ^[[33~
- (in our bash outside tmux, it’s ^[[18;2~ instead, dunno why)
HOLD ON. Look at that more closely: inside tmux, F19’s encoding ends in '3~', which is exactly the command in vim that you’d expect to toggle case for 3 characters - COINCIDENCE? I THINK NOT.
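If the caret notation is unfamiliar: ^[ is how terminals display the escape character, and in general ^X stands for the control character whose code is X's code with bit 0x40 flipped. A tiny Python sketch (the helper name `caret_to_bytes` is made up for this post) turns the two encodings above into the raw bytes the terminal actually sends:

```python
def caret_to_bytes(text):
    """Convert caret notation like '^[[33~' into the raw bytes it stands for.

    Assumes ASCII input where every '^' starts a control-character pair.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "^" and i + 1 < len(text):
            # ^X means the character X with bit 0x40 flipped, so ^[ is 0x1b (ESC).
            out.append(ord(text[i + 1].upper()) ^ 0x40)
            i += 2
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


inside_tmux = caret_to_bytes("^[[33~")     # b'\x1b[33~'
outside_tmux = caret_to_bytes("^[[18;2~")  # b'\x1b[18;2~'
print(inside_tmux, outside_tmux)
```

So inside tmux the key arrives as five bytes: escape, [, 3, 3, ~. If nothing claims that sequence as a function key, vim is left holding an escape followed by plain typed characters, ending in 3~.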
Wait a second. 3~ looks super familiar for another reason! Oh, right, I'd noticed earlier that ~/.vim/bundle/tmux-config/plugin/autowrite.vim:35 set up some function keys like so:
My eyes had skimmed over that bit earlier, because it looked like it only went up to F9. I didn't bother to verify that assumption, just moved right past it. DAMNIT. Time to search the vim docs!
OH HEY t_F9 refers to F19 in vim ARGH ARGH ARGH HOW DID I MISS THAT.
So, that's inside a conditional. It doesn't always happen. What's &term? Well, it's whatever $TERM is in bash. Okay, let's verify that!
Inside tmux, $TERM was set to 'screen'. Outside tmux, $TERM was set to 'xterm-256color'.
256color... oh hell.
Some googling around turned up this useful answer on setting up tmux to handle xterm-style function key inputs. Setting that option did in fact make the bug go away! But that option wasn't set on any of our other computers, and we have all these other things hinting at a different solution.
Time to search that tmux-config plugin for screen-256color to see where it comes up OH GODDAMNIT ~/.vim/bundle/tmux-config/plugin/tmux.conf:9 sets:
With that option set, the conditional in autowrite.vim is satisfied, and (once vim is restarted after the option is set, so that all the vim plugins are sourced again) t_F9 (which is secretly F19) is mapped to ^[[33~.
OH. OH OH OH.
To sum up
(1) tmux wasn’t set to handle xterm-style function key inputs, because our tmux.conf wasn’t actually being copied
- from: ~/.vim/bundle/tmux-config/tmux-autowrite/tmux.conf
- to: ~/.tmux.conf
(2) THEREFORE, tmux hadn’t received this config from our tmux.conf:
(3) SO, $TERM inside tmux was “screen” and outside tmux was “xterm-256color”
(4) This means tmux wasn’t set to handle xterm-style function keys (such as F19). This isn’t super-clear, to be fair. The clear way to set tmux to receive xterm function keys properly would be with “setw -g xterm-keys on”
(5) Vim checks $TERM to see if function keys are available. See the tmux FAQ. If they’re not, the character codes sent by the function keys are interpreted literally.
(6) We actually have vim set to interpret the higher function keys explicitly in autowrite.vim:35 - if $TERM is “screen-256color” (which happens explicitly in that tmux.conf we weren't using) then t_F9 (which is F19) is set to ^[[33~
(7) Why? Because (as I verified with control-v) inside tmux, F19 is encoded as ^[[33~
(8) Since we never explicitly set it otherwise, $TERM inside tmux was set to “screen” - which means that the condition in our autowrite.vim:35 was never met, and thus t_F9 was never set to ^[[33~ in vim.
(9) Because t_F9 was never mapped properly in our vim config, when that preexec function ran and bash sent “^\\ ^n F19 WriteAll” to tmux via tmux send-keys, vim escaped into normal mode because of ^\\ ^n and then interpreted the rest literally as ^[[33~WriteAll.
(10) And because the literal string ^[[33~WriteAll wasn’t mapped in vim (only <F19>WriteAll was!), each character was interpreted as a separate vim command, not part of a single mapping as intended.
^[[33~WriteAll as interpreted as a series of vim commands:
- ^[ is escape
- [3 doesn’t do anything (as far as I can tell)
- 3~ toggles case for the next three characters
- W takes you to the start of the next WORD
- ri replaces the character under the cursor with an i
- te takes you to just before the next e
- A takes you to the end of the line and puts you into insert mode, and then
- ll is inserted at the end of the line
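To see what that breakdown does to a line of text, here's a toy Python model of just those commands (3~, W, ri, te, A, ll) applied to a single line with the cursor starting at column 0. It's a sketch that follows the breakdown above, not a faithful vim emulator; the function name `replay_on_line` is made up, and the edge-case handling (what happens when there's no next WORD or no later e) is simplified.

```python
def replay_on_line(line):
    """Apply 3~ W ri te A ll to one non-empty line, cursor starting at column 0."""
    if not line:
        raise ValueError("toy model needs a non-empty line")
    chars = list(line)
    cursor = 0

    # 3~ : toggle the case of three characters, then sit just past them.
    end = min(cursor + 3, len(chars))
    for i in range(cursor, end):
        chars[i] = chars[i].swapcase()
    cursor = min(end, len(chars) - 1)

    # W : skip the rest of this WORD, then the whitespace after it.
    while cursor < len(chars) and not chars[cursor].isspace():
        cursor += 1
    while cursor < len(chars) and chars[cursor].isspace():
        cursor += 1
    cursor = min(cursor, len(chars) - 1)  # simplified: stop at end of line

    # ri : replace the character under the cursor with an i.
    chars[cursor] = "i"

    # te : move to just before the next e, if there is one on this line.
    next_e = "".join(chars).find("e", cursor + 1)
    if next_e != -1:
        cursor = next_e - 1

    # A then ll : append "ll" at the end of the line.
    return "".join(chars) + "ll"


print(replay_on_line("hello there world"))  # prints HELlo ihere worldll
```

Three toggled characters, a stray i, and ll stuck on the end of the line: the same ingredients as the mess that showed up in our files.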
Long story short, the fix was:
Process-related takeaways
Absence of evidence IS evidence of absence - we noticed pretty early on that there was no ~/.tmux.conf, then moved on, figuring that okay, guess there isn't anything weird in the config. Next time, if something is missing from a place that seems like a likely spot to look, I want to check sooner whether the analogous config file exists on working machines, so I have something to compare against.
Verify ALL assumptions sooner (or at least the easy-to-check ones) - I noticed that t_F9 thing way earlier and skimmed past it, assuming that surely t_F9 referred to F9. That's an assumption that would've been super quick to verify! Gotta verify assumptions as they're made, especially ones that are quick and easy to check out.
Edit: And via the great discussion of this post on Hacker News: "Every bug in existence is a story of different software components doing exactly what they were told to." (Unless you count cosmic ray bugs, natch.)
18 Apr 2014
Talking about Debugging with the Ruby Rogues
I got to hang out and chat about debugging with the Ruby Rogues! I was totally flattered to be invited to be their guest for Ruby Rogues episode 150: The Debugging Mindset with Danielle Sucher, and had lots of fun recording the show.
It was so fantastic to just get to chat about science and problem-solving and trying to get better about putting our egos aside and really evaluating the evidence before us with such a great group of people.
It started like this…
DAVID: I bought a microscope yesterday. And there was a splotch on it and I couldn’t figure out what it is and I did the scientific method trying to figure out where in the microscope the splotch was coming from. Turns out, I was seeing a reflection of my optic nerve.
JAMES: Nice.
[Chuckles]
JOSH: Yeah, you can look in the microscope a really long time and you won’t find that.
DAVID: Yeah.
DANIELLE: So, when you gaze into the microscope, the microscope gazes back into you.
[Laughter]
DAVID: Also gazes back to me, yeah.
JOSH: [inaudible] Are you saying that what you see inside Dave’s eyes is the abyss?
DAVID: Yes.
DANIELLE: Yeah, yeah.
JAMES: I just want to know how he proved that hypothesis false. Did he gouge one of his eyes out?
[Laughter]
DAVID: Actually, and this is the part that I was very, very proud of, I finally switched eyes. And the splotch moved and changed shape.
So brilliant!
And this was my favorite quote of mine from the episode:
"Look, the goal is to prove that I’m wrong. That means I win. I’ve proved that I was stupid about something so I can move on to being stupid about something more interesting."
Really, you can just check out the whole episode here. Have fun!
21 Mar 2014
How I remember the names of things
Me: “Remembering the names of things is the worst! Like, I can never remember which one is the trainwreck rule.”
Dave: “That’s the Law of Demeter.”
Me: “Right, I also can never remember which one the Law of Demeter is, so that makes sense. But I know and understand the actual principle!”
Dave: “Think of the dots as grains of wheat, and Demeter is the goddess of the harvest! Or think of the e’s in ‘Demeter’ as the dots in the trainwreck?”
Me: “Nah, but I can think of the e’s as regex dots and visualize the trainwreck as /D.m.t.r/! Though to be fair, that would also match Damatar, Dumutur, Dimitir…”
Dave: “Ooh, that works perfectly - with ancient Egyptian, when we don’t know what a vowel sound really was, ‘e’ is actually used as the default vowel!”
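For the record, the regex claim checks out. A quick Python sketch (Python's `re` takes the pattern without the surrounding slashes, and the list of names is just the ones from the conversation plus one non-match I added):

```python
import re

# Each dot matches exactly one character, like the dots in a trainwreck.
trainwreck = re.compile(r"D.m.t.r")

names = ["Demeter", "Damatar", "Dumutur", "Dimitir", "Dmitri"]
matches = [name for name in names if trainwreck.fullmatch(name)]
print(matches)  # prints ['Demeter', 'Damatar', 'Dumutur', 'Dimitir']
```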