Thursday, October 29, 2009

How to Convert Between DOS/Windows and Unix Text Files

The format of Unix and DOS/Windows text files differs in the way the lines end. In DOS/Windows the lines end with both carriage return and line feed ASCII characters, but Unix uses only line feed.
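To see the difference for yourself, you can dump the raw bytes of a file. The following is a small sketch, not part of the original recipe: it builds two sample files with printf (the temporary directory and sample text are only illustrative) and prints them with od -c, which shows carriage returns as \r and line feeds as \n.

```shell
# Build two small sample files with printf so the line endings are explicit.
tmpdir=$(mktemp -d)
printf 'line one\nline two\n' > "$tmpdir/unixfile.txt"
printf 'line one\r\nline two\r\n' > "$tmpdir/winfile.txt"

# od -c prints each byte; \r appears only in the DOS/Windows file.
od -c "$tmpdir/unixfile.txt"
od -c "$tmpdir/winfile.txt"

# The DOS/Windows file is one byte longer per line (two lines here).
unix_size=$(wc -c < "$tmpdir/unixfile.txt" | tr -d ' ')
win_size=$(wc -c < "$tmpdir/winfile.txt" | tr -d ' ')
echo "unix: $unix_size bytes, windows: $win_size bytes"
```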

So, some applications in Unix may display the carriage returns from a DOS/Windows text file, for example:
   line one^M
   line two^M
And some applications in Windows may not display the line breaks from a Unix text file, for example:
   line one
   line two
There are many ways to solve this problem; here we will use only the sed and tr utilities to convert one file format to the other. The sed commands rely on GNU sed, which understands the \r and \x1A escapes.

Converting from Unix to DOS/Windows:
   sed -e 's!$!\r!g' unixfile.txt > winfile.txt
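One caveat: the command above appends a carriage return to every line, so running it on a file that is already in DOS/Windows format leaves two carriage returns per line. A possible variant, sketched here under the assumption that GNU sed is available, first strips any trailing carriage return and then adds exactly one, so it is safe to run twice. The sample file and the out.txt name are only illustrative.

```shell
# Sample input that is already in DOS/Windows format.
tmpdir=$(mktemp -d)
printf 'line one\r\nline two\r\n' > "$tmpdir/winfile.txt"

# Strip an existing trailing CR first, then add exactly one CR.
# Assumes GNU sed, which interprets \r as a carriage return.
sed -e 's!\r$!!' -e 's!$!\r!' "$tmpdir/winfile.txt" > "$tmpdir/out.txt"

# cmp -s exits with status 0 when the two files are byte-for-byte identical.
if cmp -s "$tmpdir/winfile.txt" "$tmpdir/out.txt"; then
    result="unchanged"
else
    result="changed"
fi
echo "$result"
```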

Converting from DOS/Windows to Unix:
   sed -e 's!\r$!!g' -e 's!\x1A!!g' winfile.txt > unixfile.txt
or
   tr -d "\32\r" < winfile.txt > unixfile.txt
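Notice that both commands also remove Ctrl-Z (\32 in octal, \x1A in hexadecimal), the old DOS end-of-file marker. The tr command deletes every carriage return in the file, while the sed command removes only those at the end of a line.
You can't use tr to convert a file from Unix format to DOS/Windows format, because tr only translates or deletes single characters and cannot insert a carriage return before each line feed.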
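One way to check that the tr command above did its job is to count the carriage return and Ctrl-Z bytes before and after the conversion. This is only a sketch on a made-up sample file: tr -cd keeps nothing but the listed bytes, and wc -c counts what is left.

```shell
tmpdir=$(mktemp -d)
# Sample DOS/Windows file that ends with a Ctrl-Z byte (octal 032).
printf 'line one\r\nline two\r\n\032' > "$tmpdir/winfile.txt"

# The conversion from the article.
tr -d "\32\r" < "$tmpdir/winfile.txt" > "$tmpdir/unixfile.txt"

# Keep only the CR and Ctrl-Z bytes (-c complements the set, -d deletes
# everything in the complement), then count them.
before=$(tr -cd '\32\r' < "$tmpdir/winfile.txt" | wc -c | tr -d ' ')
after=$(tr -cd '\32\r' < "$tmpdir/unixfile.txt" | wc -c | tr -d ' ')
echo "before: $before, after: $after"
```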

If the input file is also the output file, you can use sed's -i (in-place) option:
   sed -i -e 's!$!\r!g' samefile.txt
and
   sed -i -e 's!\r$!!g' -e 's!\x1A!!g' samefile.txt
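Because -i overwrites the original, it can be worth keeping a backup. With GNU sed, a suffix attached directly to -i (for example -i.bak) saves the untouched file under that suffix before editing. A minimal sketch on a throwaway sample file:

```shell
tmpdir=$(mktemp -d)
printf 'line one\r\nline two\r\n' > "$tmpdir/samefile.txt"

# -i.bak edits samefile.txt in place and keeps the original as samefile.txt.bak.
# Assumes GNU sed for the \r and \x1A escapes.
sed -i.bak -e 's!\r$!!' -e 's!\x1A!!g' "$tmpdir/samefile.txt"

# Both the converted file and the backup should now exist.
ls "$tmpdir"
new_size=$(wc -c < "$tmpdir/samefile.txt" | tr -d ' ')
bak_size=$(wc -c < "$tmpdir/samefile.txt.bak" | tr -d ' ')
echo "converted: $new_size bytes, backup: $bak_size bytes"
```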
If you use these commands a lot, you can wrap them in shell functions, for example in bash:
   # Usage: unix2dos unixfile.txt winfile.txt
   function unix2dos {
      sed -e 's!$!\r!g' "$1" > "$2"
   }

   # Usage: dos2unix winfile.txt unixfile.txt
   function dos2unix {
      sed -e 's!\r$!!g' -e 's!\x1A!!g' "$1" > "$2"
   }

   dos2unix winfile.txt unixfile.txt

   unix2dos unixfile.txt winfile.txt
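Before calling one of the functions above, it can help to know which format a file is in. The helper below is a sketch only: has_crlf is a hypothetical name, not a standard command, and it simply reports whether the file contains any carriage return at all.

```shell
# has_crlf is a hypothetical helper name, not a standard command.
# It succeeds (exit status 0) if the file contains at least one carriage return.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Try it on two small sample files.
tmpdir=$(mktemp -d)
printf 'line one\nline two\n' > "$tmpdir/unixfile.txt"
printf 'line one\r\nline two\r\n' > "$tmpdir/winfile.txt"

for f in "$tmpdir/unixfile.txt" "$tmpdir/winfile.txt"; do
    if has_crlf "$f"; then
        echo "$f: DOS/Windows line endings"
    else
        echo "$f: Unix line endings"
    fi
done
```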
