LooselyTyped

(map blog thoughts)

Nov 12, 2017 - devops

Know Your Line Endings

Working with git in a multi-OS setting can lead to some rather bizarre behavior, especially if git’s eol (end-of-line) settings are not configured correctly.

In our case, the problem was made worse by shell scripts and Docker image builds.

Setting up the scene

Let us imagine we have a git repository containing a shell script (aptly named list), and a Dockerfile. Here are the contents of list:

list
#!/usr/bin/env sh

ls -al /tmp

If you are playing along, be sure to make list executable with chmod +x list.

Here is our (rather simplistic, and admittedly pointless) Dockerfile.

Dockerfile
FROM alpine:3.4

COPY list /usr/bin/ (1)
RUN list (2)
1 COPY list from the current directory into the container’s bin directory
2 Execute the script as part of building the image

Great! Now we can build a docker image (with the tag demo) like so:

Building a docker image
> docker build -t demo .

If all goes well, you should see something like this:

Docker output
Sending build context to Docker daemon  50.69kB
Step 1/3 : FROM alpine:3.4
 ---> 016182cd451a
Step 2/3 : COPY list /usr/bin/
 ---> 16e0df706fcf
Step 3/3 : RUN list
 ---> Running in d1302130bd13
total 8
drwxrwxrwt    2 root     root          4096 May  9  2017 .
drwxr-xr-x   25 root     root          4096 Nov 10 16:41 ..
Removing intermediate container d1302130bd13
 ---> 8a2292841f23
Successfully built 8a2292841f23
Successfully tagged demo:latest

We can see the contents of the tmp directory listed as part of building the image.

Now, for grins and giggles, use your favorite text editor to convert the line-endings in list to CRLF. In Sublime Text 3 this option is available under View > Line Endings. In VS Code there is a toggle at the bottom right of the editor that allows you to switch from LF to CRLF. Be sure to save the file!
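If you would rather not open an editor, the same conversion can be done from the shell. This is only a minimal sketch, assuming a POSIX sh with awk and mktemp available; the helper name to_crlf is made up for illustration.

```shell
#!/usr/bin/env sh
# Hypothetical helper: rewrite a file in place with CRLF line endings.
to_crlf() {
    tmp=$(mktemp) || return 1
    # Strip any existing CR first so already-converted lines are not doubled,
    # then terminate every line with CR LF.
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$1" > "$tmp" &&
        cat "$tmp" > "$1"   # cat (rather than mv) keeps the executable bit on the original
    rm -f "$tmp"
}

# Usage:
#   to_crlf list
```

Running to_crlf list on the three-line script above should leave you with a 34-byte file, which matches the size reported by ls later in this post.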

Now, let us attempt to build our Docker image again. Here is the resulting output:

Building an image again
> docker build -t demo .
Sending build context to Docker daemon  56.32kB
Step 1/3 : FROM alpine:3.4
 ---> 016182cd451a
Step 2/3 : COPY list /usr/bin/
 ---> Using cache
 ---> e523cf78e1a3
Step 3/3 : RUN list
 ---> Running in ff08de7f9a94
': No such file or directory
The command '/bin/sh -c list' returned a non-zero code: 127

One might not be surprised that this did not work; after all, we introduced line delimiters that a Unix shell does not expect. However, the error is a tad baffling. We can attempt to debug this by commenting out the RUN list instruction in our Dockerfile, building the image again, and then checking whether the file was indeed copied over.

Building the image again without RUN list
> docker build -t demo . (1)
Sending build context to Docker daemon  56.32kB
Step 1/2 : FROM alpine:3.4
 ---> 016182cd451a
Step 2/2 : COPY list /usr/bin/
 ---> Using cache
 ---> e523cf78e1a3
Successfully built e523cf78e1a3
Successfully tagged demo:latest
> docker run --rm demo ls -al /usr/bin/list (2)
-rwxr-xr-x    1 root     root            34 Nov 10 16:46 /usr/bin/list
> docker run --rm demo ./usr/bin/list (3)
': No such file or directory
1 Build the image again
2 See if /usr/bin/list exists
3 Attempt to execute it

The file exists, yet both the RUN instruction and our attempt to execute it manually via docker run fail.
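One way to see what is going on is to dump the first line of the script byte by byte. With CRLF endings the shebang line ends in a carriage return, so env is most likely being asked to run a program literally named sh followed by a CR, which does not exist. When the error message is printed, that CR sends the cursor back to the start of the line, which would explain why only the tail of the message is visible. The sketch below assumes head, od and grep are available; first_line_bytes and has_cr are made-up helper names.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print the raw bytes of a file's first line.
# In a CRLF file the shebang line ends in "\r \n" rather than just "\n".
first_line_bytes() {
    head -n 1 "$1" | od -c
}

# Hypothetical helper: succeed if the file contains a carriage return.
has_cr() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Usage:
#   first_line_bytes list
#   has_cr list && echo "list has CRLF line endings"
```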

So what has this got to do with git?

Git configuration

It turns out that if this code is checked into a git repository and someone on Windows clones it, Git will, with the settings Git for Windows typically installs (core.autocrlf set to true), convert the line-endings to CRLF on checkout! In almost all cases this is what you want. Except when you don’t :)

You can easily test this by initializing a git repository (assuming you are playing along) in the current directory, and then forcing git to check out all files with CRLF like so:

Simulating the problem
> git init;
> git add .;
> git commit -m "initial commit with LF line-endings"; (1)

> git config --local core.eol crlf;
> git config --local core.autocrlf true;  (2)
> git rm --cached -r . && git reset --hard; (3)

> git status; (4)
On branch master
nothing to commit, working tree clean
> docker build -t demo .; (5)
Sending build context to Docker daemon  55.81kB
Step 1/3 : FROM alpine:3.4
 ---> 016182cd451a
Step 2/3 : COPY list /usr/bin/
 ---> Using cache
 ---> e523cf78e1a3
Step 3/3 : RUN list
 ---> Running in a1e801e1faaf
': No such file or directory
The command '/bin/sh -c list' returned a non-zero code: 127
1 Initialize, add and commit all files with LF line-endings
2 Apply the settings that are typical on Windows
3 Force git to re-checkout all files
4 Notice that git status reports a clean working directory
5 docker build fails

Notice that git status reports a clean working directory. Here lies the part where you are left scratching your head when you attempt to debug this problem. It’s the same repository, with the same code-base, and git reports that nothing has been touched!
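The reason git status stays quiet is that the CRLF in the working tree is exactly what git expects to produce from the LF content it has stored. To see the two sides separately, newer versions of git (2.8 and later, as far as I know) offer git ls-files --eol, which reports the line-endings in the index and in the working tree for each file. Here is a rough sketch; crlf_checkouts is a made-up helper name.

```shell
#!/usr/bin/env sh
# Hypothetical helper: list tracked files that are stored with LF in the index
# but currently have CRLF in the working tree.
# `git ls-files --eol` prints one line per file, for example:
#   i/lf    w/crlf  attr/                   list
# with a TAB separating the info columns from the file name.
crlf_checkouts() {
    git ls-files --eol | awk -F'\t' '$1 ~ /^i\/lf +w\/crlf / { print $2 }'
}

# Usage (run inside the repository):
#   crlf_checkouts
```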

Solution

If you wish to preserve CRLF on Windows, but maintain LF across your text files on OSX/Linux, as well as in the git repository, then the .gitattributes file is your friend. This answer on Stack Overflow provides a rather comprehensive treatment of the subject.

However, we must realize that our situation is a tad different — in that, we not only want git to normalize line-endings in the repository, but keep those line-endings in the working directory. In other words, we have to inform git to keep LF both when checking in, and checking out files. The solution is to explicitly set the line-ending normalization to be LF like so:

.gitattributes
* text=auto eol=lf

As described on the .gitattributes man page, this will ensure that all files git detects as text have LF in the working directory.
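If you want to convince yourself that the attribute wins over core.autocrlf, the whole thing can be replayed in a throwaway repository. This is only a sketch under a few assumptions: git, mktemp and grep are installed, and the demo identity and commit message are placeholders.

```shell
#!/usr/bin/env sh
# Sketch: show that "* text=auto eol=lf" keeps LF even with core.autocrlf=true.
# Everything happens in a temporary repository; your real project is not touched.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.com"   # placeholder identity for the demo commit
git config user.name "demo"
git config core.autocrlf true              # the Windows-style setting from the simulation

printf '#!/usr/bin/env sh\n\nls -al /tmp\n' > list
printf '* text=auto eol=lf\n' > .gitattributes
git add .
git commit -q -m "add list and .gitattributes"

# Force a fresh checkout, as in the simulation earlier.
git rm --cached -r -q . && git reset --hard -q

# Count lines containing a carriage return; 0 means list is still LF.
cr=$(printf '\r')
cr_lines=$(grep -c "$cr" list || true)
echo "lines with CR in list: $cr_lines"
```

If the count is anything other than 0, the attribute is not being applied; checking for a typo in the .gitattributes file name or pattern is a reasonable first step.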

Admittedly, our example is rather trivial. In a real project you will have other kinds of files where you want native line-endings in the working directory. In that case, you will have to fine-tune the .gitattributes file: explicitly set the eol attribute for specific files or wild-card patterns, while ensuring that all files have LF when checked in and default to the native line-endings on checkout.

Let’s say our project had a few more files — some shell scripts, a bat file, some Java source code, and the aforementioned Dockerfile and list file. You might have a .gitattributes file that looks like this:

Slightly more comprehensive .gitattributes file
*.sh  eol=lf    (1)
*.bat eol=crlf  (2)

list  eol=lf    (3)

*     text=auto (4)
1 Wild-card for all shell scripts to have LF in the working directory
2 Wild-card for all bat files to have CRLF in the working directory
3 Explicitly call out list since the wild-cards will not catch it
4 Treat everything else as git would do by default

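We should note that .gitattributes, much like .gitignore, is parsed top-to-bottom: when several patterns match a path, a later line overrides an earlier one, attribute by attribute. Here the catch-all line sets only text while the specific lines set only eol, so they do not conflict and the order is harmless. If the catch-all also set eol, it would have to come first so that the more specific patterns could override it.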
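To check how git actually combines the lines for a given path, git check-attr can be asked directly. The helper below is a small sketch; resolved_attrs is a made-up name, and the file names other than list are invented for the example.

```shell
#!/usr/bin/env sh
# Hypothetical helper: print "text=<value> eol=<value>" as git resolves them for a path.
# `git check-attr` prints lines such as "list: eol: lf"; sed keeps only the value.
resolved_attrs() {
    text=$(git check-attr text -- "$1" | sed 's/.*: text: //')
    eol=$(git check-attr eol -- "$1" | sed 's/.*: eol: //')
    printf 'text=%s eol=%s\n' "$text" "$eol"
}

# Usage (run inside the repository):
#   resolved_attrs build.sh    # a shell script
#   resolved_attrs run.bat     # a bat file
#   resolved_attrs Main.java   # matched by the catch-all only
```

The path does not need to exist for git check-attr to answer, so this is a cheap way to try out a pattern before committing the .gitattributes file.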

Hope this helps!