I have a technical choice to make, and I don’t know the right answer – perhaps you can help me?
I apologise in advance for a long and technical posting – it will doubtless be “dumbed down” in the opinion of those who already understand the problem, and “full of mind-numbing details” for those who don’t.
The Line Endings Problem
Symbian has millions of lines of source code, stored in text files. We want that code to be accessible to people who have PCs running Windows, and we’d also like it to be accessible to people running Linux, but those systems use different sequences of characters in a file to indicate the end of a line of text. It’s a standard problem faced by any programming project which expects to work with both Windows and Linux, and it’s usually solved by having your source code control system automatically convert text files into the appropriate format for each user.
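To make the difference concrete, here is a minimal Python sketch (not taken from any Symbian tool) that counts the end-of-line sequences in a block of bytes. The function name count_line_endings and the sample text are made up for illustration.

```python
# Windows ends each line with CR+LF (bytes 0x0D 0x0A); Linux uses LF alone (0x0A).
windows_text = b"int main()\r\n{\r\n}\r\n"
linux_text = b"int main()\n{\n}\n"


def count_line_endings(data: bytes) -> dict:
    """Count CR+LF pairs, bare LF and bare CR sequences in a byte string."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # LFs that are not part of a CR+LF pair
    cr = data.count(b"\r") - crlf   # CRs that are not part of a CR+LF pair
    return {"crlf": crlf, "lf": lf, "cr": cr}


print(count_line_endings(windows_text))  # {'crlf': 3, 'lf': 0, 'cr': 0}
print(count_line_endings(linux_text))    # {'crlf': 0, 'lf': 3, 'cr': 0}
```

The visible text is identical in both cases; only the invisible bytes at the end of each line differ.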
Symbian is using Mercurial to store the files, but a fundamental Mercurial design decision is not to associate a “type” with each file. Another design decision is to leave the content of the files unaltered – there is no automatic translation in core Mercurial.
The standard Mercurial solution to the line endings problem is to ask all Windows users to enable a Mercurial extension which recognises text files, converts them from Windows format to Linux format before storing them in the repository, and converts them from Linux format back into Windows format when extracting them onto your local disk.
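The round trip described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the extension’s actual code: the function names are hypothetical, and treating any file that contains a NUL byte as binary is an assumption made for this sketch.

```python
def looks_binary(data: bytes) -> bool:
    """Assumed heuristic for this sketch: a NUL byte means 'not a text file'."""
    return b"\0" in data


def to_repository(data: bytes) -> bytes:
    """Working copy -> repository: CR+LF becomes LF for text files."""
    if looks_binary(data):
        return data
    return data.replace(b"\r\n", b"\n")


def to_working_copy(data: bytes) -> bytes:
    """Repository -> working copy: LF becomes CR+LF for text files."""
    if looks_binary(data):
        return data
    # Normalise first so that an existing CR+LF is not turned into CR CR LF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


source = b"first line\r\nsecond line\r\n"
stored = to_repository(source)            # what the repository would hold
assert to_working_copy(stored) == source  # a Windows user gets the original back
```

A Linux user would simply read the stored bytes without any conversion.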
All of the Symbian Platform code has been added into Mercurial from Windows using the Mercurial extension, so it is stored internally in the Linux format: that’s spread over 100+ repositories. The Tools code has not been handled in that way, so a further 30+ repositories store files in the Windows format.
So What’s Your Dilemma?
At present, anyone working on the Symbian platform will be using Windows. Enabling the Mercurial extension has to be done by each user changing either their global Mercurial settings, or by altering the settings in individual local copies of the repositories. If you have the extension enabled and read from a repository in the Linux format, you will get warnings about line endings for every file that you extract. If you have the extension enabled and commit changes into a repository which doesn’t use the Linux format internally, then Mercurial will think that you are changing all of the files, even the ones you haven’t altered.
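The “every file looks changed” effect is easy to reproduce. In this small Python sketch (the file content and the function name changed_lines are invented for illustration), a file stored in Windows format is converted to Linux format, as would happen on commit, and then compared line by line with the stored version.

```python
def changed_lines(old: bytes, new: bytes) -> int:
    """Count lines whose bytes differ, line endings included."""
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    return sum(1 for a, b in zip(old_lines, new_lines) if a != b)


stored = b"line one\r\nline two\r\nline three\r\n"   # repository holds Windows format
converted = stored.replace(b"\r\n", b"\n")           # what a converting commit would submit

print(changed_lines(stored, converted))  # 3
# The visible text on every line is nevertheless identical:
print(stored.splitlines() == converted.splitlines())  # True
```

Every line differs at the byte level even though nobody edited the text, which is why the whole file is reported as modified.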
Clearly, I need to decide on a policy and apply it to all of the Symbian repositories. Equally clearly, the sooner the decision is made, the less disruption it will cause.
I can see two choices (which I will call Red and Blue because I don’t know which one is better than the other).
Red Choice – Linux format
Insist on using the Linux format for the internal storage in all repositories, requiring all Windows users to enable the extension.
This ought to make the files look right for all systems, regardless of how simple your compiler or text editor program is. A small snag is that there are some text files which are just data for use on the Symbian devices, and where the code currently has to have the Windows line-endings. The estart.txt file used to configure the Symbian file server is the one we’ve uncovered so far, and we would probably need to change the code to make it work with Linux line endings as well.
This solution forces all Windows users to enable the extension, and as there are 130+ repositories that’s probably something they would do in their global settings. Some other Open Source projects have taken this approach, e.g. NetBeans, but not very many, and it could be very annoying for developers if they also work with code that has a different policy.
Blue Choice – No translation
Insist that Windows users do not use the extension, and instead require everyone to deal with the files in the format that they find them.
This is likely to mean that all text files are in the Windows format, and we just have to hope that Linux compilers and text editors will respect that and behave nicely.
It avoids making Windows users change anything, but if they have enabled the extension we could still get the whole repository changing “accidentally” unless they explicitly disable the extension for the Symbian Foundation repositories.
Help…!
What is the “common practice” for other projects which use Mercurial and support development on Windows?
Do you favour the Red choice over the Blue choice? Is there a Green choice which I haven’t considered?
Please let me know what you think I should do….
Thanks,
William


Could you use a source code control system that does associate a “type” with each file, and automatically converts to the right type for each system? Or, to put it another way, why Mercurial?
Just keep everything in Unix format and insist that Windows developers use the Unix format, with no automatic conversions. In 2009, every serious developer on Windows already uses an editor that understands Unix newline conventions. Many Windows developers even use the Unix newline conventions as their default choice. A huge percentage of the people that want to work on Symbian code will be using Carbide.c++ on Windows. Carbide.c++ on Windows seems to be able to handle Unix line endings just fine. Those that choose another editor just need to choose one that isn’t Notepad. (Note: I primarily use Windows, especially when doing anything Symbian related.)
This used to be a much more interesting problem, when Mac OS was still using just CR as its line break.
http://en.wikipedia.org/wiki/Line_endings
Sorry, this wasn’t really helpful nor technical enough.
More than anything I can’t believe that this wasn’t debated and decided back before the SCM was chosen – never mind now that all the sources have been uploaded.
Anyway.
I’d say there are two things to look at here. Who will be impacted most by the decision and what is the “usual” thing to do in similar open source communities.
If the Nokia developers (who are, let’s face it, going to make up the vast majority of people working with Foundation code for some time to come) were working directly in the Foundation’s Hg repository, it might be a big deal to change, as the vast majority of them are working on Windows. However, as they are likely to continue to work in Synergy and P4 for the foreseeable future, that is less of an issue, as the line-end switch could potentially be done in the Synergy-to-Hg sync.
The other question is what is done in other large open source projects? What does Google, for example, do in its Git repository? I suspect, but don’t know for sure, that they leave them as Unix line ends since, as Brian says above, developers are either using Linux or are smart enough to understand the issue.
In short, I personally would suggest using the Unix line endings. The bigger question is what happens when the code is sync’d into and out of Synergy and P4. Will Unix line endings contaminate the Nokia SCM? Will Windows line endings be automatically stripped when code is copied into Hg? That’s another whole set of decisions that need to be agreed with the Nokia SCM people.
I definitely agree with Brian, simply choose one line-ending format, and have the user’s editors deal with it.
My personal preference would be to use Unix line-endings, but even if they were all Windows line-endings, and I needed to setup emacs/vim to deal with it, it wouldn’t be the end of the world.
Definitely better than converting back and forth.
If an ISV wants to reach a global Linux audience, they must support more than one distribution of Linux. These challenges and variances make it difficult–and costly–for ISVs to target the Linux platform.
The Linux Standard Base was created to solve these challenges and lower the overall costs of supporting the Linux platform. By reducing the differences between individual Linux distributions, the LSB greatly reduces the costs involved with porting applications to different distributions, as well as lowers the cost and effort involved in after-market support of those applications.
Anyone who wants to help in the LSB process is always welcome.
http://www.linuxfoundation.org/collaborate/workgroups/lsb
I also agree with Brian, choose one format and let the users handle it.
I just did a quick test here and all of my Linux editors (I tried Kate and vim) correctly handled Windows line-breaks (i.e. they showed them correctly and didn’t change them).
However, the Windows editor didn’t handle the Linux-style endings. So sticking to the Windows-style might even be the better option.
One thing you want to make sure is, however, that users don’t commit if their editor has screwed up the line breaks…
I vote for Unix line endings.
I sincerely hope there is an official Linux version of the SDK on the road map. At the moment we have to maintain Windows boxes just for Symbian development, and everybody here would be happier with a Linux version.
Martin Storsjo (http://martin.st/symbian/) has provided a working SDK for Linux. But the craptacular epoc emulator doesn’t work.
By the way David, could you write a post about the Symbian foundation position on the state of the Symbian SDK, and what the future may hold?
There have been great strides made by switching to Eclipse and giving away Carbide for free, but there is still a long way to go compared to the Android or iPhone SDKs.
I am specifically thinking of things like:
* Warnings in the SDK header files. How can this pass QA?
* The sad state of the “emulator”. Why not use an ARM emulator and scrap the X86 builds?
* Needing separate SDK distributions for separate phones (one Samsung, one S60 5th, one N97…)
* API Documentation quality, especially on the S60 side
* The arcane build system: bld.inf, MMPs, bldmake, ABLD, bizarre file formats like PKG…
And of course:
* Symbian signed, especially certified signed which is a royal PITA
It is obvious from the graphical and branding side of things that the Symbian Foundation takes public perception seriously. But the quality of the SDK leaves a bad taste in developers’ mouths.
My vote is for Unix format line endings. Yes being a Linux user I would of course vote for this, but there is a reason other than zealotry.
The Linux ecosystem has a huge pool of very good programmers (yes, there are some bad ones too), and for Symbian to grow its community it needs to be accessible to all. If it is just a matter of enabling a plugin on Windows then I think that would be the least disruptive. I presume Mac OS follows the Unix convention, so the choice kind of comes down to how many groups of developers you want to inconvenience (even if it is a small inconvenience). From my discussions and dealings with developers, they are inherently lazy (no, I’m not being rude; Ruby for instance promotes laziness – don’t duplicate code). As such I think some developers may not bother contributing because there is a small barrier in front of them.
Yes this may have been better if it was brought up at the time of SCM choice, but hats off to you for highlighting your issue and requesting feedback on how best to go.
I guess I would use the Unix line endings for everything and force the extension to be used for Windows. As I understand it, this is how we use Perforce at the moment (the client spec LineEnd option defaults to ‘local’, i.e. convert to local convention on sync but submit in UNIX format).
So, does this mean that estart.txt etc. are stored with Unix line endings in Perforce at the moment? So it’s just coincidence that when you sync onto a Windows PC, estart.txt is converted into the expected format? I guess this has just worked up to now because the emulator only works on Windows anyway?
What’s the line-end convention for Symbian OS, if there is one? Perhaps the Symbian build system should convert text files to the expected line endings when exporting data?
> I sincerely hope there is an official Linux version of the SDK on the road
> map, at the moment we have to maintain windows boxes just for Symbian development
Hopefully that requirement will end by the end of the year. Raptor already works on Linux and Carbide is moving that way as we speak.
> But the craptacular epoc emulator doesn’t work.
The craptacular emulator is four years or more out of official support and should, dear god, be replaced with either QEMU or a freeware distribution of ARM’s own simulator as soon as physically possible.
How about a solution where the master repository is consistent (Unix or Windows, it doesn’t matter), and when somebody commits, a script checks the line endings for consistency and rejects the commit if there is a clash?
cheers
Mark
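A check along the lines Mark suggests might look something like the following Python sketch. It is not wired into Mercurial here; the function names, the file names and the choice of LF as the required style are all assumptions for illustration, and files containing a NUL byte are assumed to be binary and skipped.

```python
def line_ending_styles(data: bytes) -> set:
    """Return the set of line-ending styles ('crlf', 'lf', 'cr') found in data."""
    styles = set()
    if b"\r\n" in data:
        styles.add("crlf")
    rest = data.replace(b"\r\n", b"")  # drop the pairs, leaving only bare LF or CR
    if b"\n" in rest:
        styles.add("lf")
    if b"\r" in rest:
        styles.add("cr")
    return styles


def files_to_reject(files: dict, required: str = "lf") -> list:
    """Names of text files whose line endings clash with the required style."""
    rejected = []
    for name, data in sorted(files.items()):
        if b"\0" in data:          # assumed binary: not checked
            continue
        styles = line_ending_styles(data)
        if styles and styles != {required}:
            rejected.append(name)
    return rejected


# Hypothetical commit contents: file name -> bytes about to be committed.
commit = {
    "good.cpp": b"int a;\nint b;\n",
    "mixed.cpp": b"int a;\r\nint b;\n",
    "image.bin": b"\0\r\n\0",
}
print(files_to_reject(commit))  # ['mixed.cpp']
```

A commit hook would refuse the commit whenever the returned list is non-empty.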
I’d suggest not requiring either line ending style or doing any conversions, but require that developers not change the line ending style for a given file to avoid meaningless changes being reported. Most programmers’ editors should be able to cope with this, and hopefully then there’d be files with different line ending styles motivating people to ensure that their software can cope with both endings.
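As a rough illustration of “don’t change the style of a given file”, here is a Python sketch of a tool that appends a line using whichever ending the file already uses. The helper names are hypothetical, and falling back to LF for files with no line endings at all is an arbitrary assumption.

```python
def dominant_ending(data: bytes) -> bytes:
    """Return CR+LF if that is the more common ending in data, otherwise LF."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf
    return b"\r\n" if crlf > lf else b"\n"


def append_line(data: bytes, text: bytes) -> bytes:
    """Append one line of text, keeping the file's existing line-ending style."""
    eol = dominant_ending(data)
    if data and not data.endswith((b"\n", b"\r")):
        data += eol  # terminate a final line that had no ending
    return data + text + eol


print(append_line(b"a\r\nb\r\n", b"c"))  # b'a\r\nb\r\nc\r\n'
print(append_line(b"a\nb\n", b"c"))      # b'a\nb\nc\n'
```

Working on bytes rather than decoded text matters here: it avoids any newline translation happening behind the tool’s back.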
Casing of filenames and #includes tends to also be an issue for Linux users (and generally for those on case sensitive filesystems). Are there checks in place to ensure consistency with casing?
How does Python deal with its Mercurial-related issues now? Maybe there’s a clue out there?
Hi, Nithya. Last time I checked, the Linux (and other *nixes) version of Python choked whenever it encountered a file with Windows/MS-DOS line endings, and I think some versions of GCC also have problems. I don’t have a Windows machine at home at the moment, so I don’t know if there’s a problem in the other direction.
I agree that editors and tools ought to be able to handle all flavours of line-endings (and I think, with the exception of Windows Notepad, most do) regardless of the OS they run on. Having said that, I still think it’s a good idea to standardise the line-ending style across all the Symbian source-code. Apart from anything else it just feels cleaner and tidier to me.
My vote would go for using *NIX style line-ends. Apart from the reasons already mentioned, they’re just shorter – one byte instead of two! I know it’s a bit daft, but if you’re storing 40 million lines of code that equates to something in the region of 40 megabytes of wasted space! Not much perhaps, but you know what they say: Waste not, want not!
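The back-of-the-envelope figure above checks out, as this tiny Python calculation shows (the 40 million line count is the commenter’s own round number, and the sum ignores any compression a repository might apply):

```python
lines_of_code = 40_000_000
extra_bytes_per_line = len(b"\r\n") - len(b"\n")  # CR+LF is one byte longer than LF
wasted_bytes = lines_of_code * extra_bytes_per_line

print(wasted_bytes / 1_000_000)  # 40.0 (megabytes)
```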
Btw, @Tero: Filenames and references to them (like in #include’s) in the Symbian source code are already case-consistent to support builds on Linux.
Hi,
It can be either way as long as the users don’t need any plugins or special settings. Things are already complex enough to set up and start. Seasoned developers may find it convoluted, but for beginners, there is just too much magic.
I happened to have seen the reasons for choosing Mercurial and the case was at best questionable. Subversion was discarded on the basis that it is ‘not distributed’, which is a super fallacy. Subversion works in a similar way to Mercurial (e.g. you can work offline, no problem), it has much more familiar paradigms, and there is no need whatsoever for separate repositories. And it scales brilliantly.
Anyway, please try to keep things simple going forward!
Thanks,
Ivan
@Ivan
svn can work offline in that you can edit your files. But you can’t check in or revert without having a connection to the central repository.
My vote is for *nix format of line-ends.
I think it is important we support all 3 major desktop platforms: Windows, Linux and Mac OS X. If two of the platforms (Linux and Mac OS X) can be common then that should be the native format.
What does Mozilla do?
The problem is that neither Mercurial nor Git inherits any setting for EOL handling from the repository the developer is cloning from. It’s entirely the responsibility of the developer to choose the right settings, and there is no way to enforce any policy (other than to automatically fail push & commit on the master repos if they don’t meet the requirements).
The EOL issue was considered in the selection process, so it’s not a surprise. Mercurial and Git were rated equal.
Currently the data internally (in Nokia and xSymbian) is stored in a unix system, on Synergy & Perforce systems, with EOL’s converted to unix-style. The Synergy and Perforce Windows clients do the conversion to and from Windows format automatically.
Most of the large open source projects are moving from centralized version control (Subversion, CVS) to distributed (mostly Git and Mercurial). In my opinion it would have been a huge mistake to not choose a distributed open source version control system for the Foundation.
I vote for RED.
I just checked out the code extracted from the repository with Carbide.c++ (Mercurial Eclipse plug-in, default settings). The code looks good and guess what, the files have DOS line endings (no, it’s not Tools code). Is said extension enabled? If yes, I have not done it so we can assume it to be default and thus a de facto standard for how the files will be handled when accessed through the only Symbian IDE.
It’s funny to see all the pro-Linux votes above at a time when Linux is at best a potential option for Symbian development. It would have been more interesting not to have this post, use the Windows files and then see how many complaints would have been received about the issue. I would assume far fewer than the pro-Linux votes we see now.
Can somebody please share his/her Symbian on Linux development setup? There are many developers that would indeed like to use something like that.
I vote thus for the blue pill, we all know that Linux has the best tools ever so of course their compilers and editors can handle this little problem.
My preferences would be to avoid any “magical” transformations in any of the repositories. That has several benefits:
No need to expend effort making sure every machine is configured,
If a repository needs to store two files with different line endings, it can,
Doesn’t impinge upon using other Hg repos on the same machine.
The (only?) benefit of enabling the magical transformation would be to have files appear with the native line endings on each developer machine. However, as expressed above, no developer tools actually care (I’ll avoid describing Notepad as a developer tool).
The other thing to consider is: what line-endings does the Symbian code itself expect?
If a Linux machine were used to build a ROM, and builds it with Unix line-endings, but the Symbian code is expecting DOS line-endings, then the ROM may not run correctly. If a Windows machine were used to do the same thing, and the transformation were switched off, then it would encounter exactly the same problem. (So far we’ve encountered exactly this problem, where the effect was that the emulator didn’t even boot.)
If transformations are off then the DOS line endings are preserved on all systems, and code that expects them will continue to work.
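The failure mode described above can be shown with a small Python sketch. Neither function is Symbian code, and the key=value data is invented rather than real estart.txt content: one parser splits only on CR+LF, as code written for Windows-format data might, and the other accepts either style.

```python
def parse_strict_crlf(data: bytes) -> list:
    """Split only on CR+LF; Unix-format input is not split at all."""
    return [line for line in data.split(b"\r\n") if line]


def parse_tolerant(data: bytes) -> list:
    """Split on CR+LF, LF or CR, so both formats give the same entries."""
    return [line for line in data.splitlines() if line]


dos_file = b"alpha=1\r\nbeta=2\r\n"   # hypothetical configuration data
unix_file = b"alpha=1\nbeta=2\n"

print(parse_strict_crlf(dos_file))    # [b'alpha=1', b'beta=2']
print(parse_strict_crlf(unix_file))   # [b'alpha=1\nbeta=2\n']  (one mangled entry)
print(parse_tolerant(unix_file))      # [b'alpha=1', b'beta=2']
```

Making the consuming code tolerant in this way is the alternative to guaranteeing that the data always arrives with DOS line endings.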
I’m not sure that there’s any benefit to standardising on any line-ending across the whole codebase. Is there any tangible problem caused by having them all mixed up?
If no effective solution is in sight to take care of all flavours, then it might help to provide a batchscript file to perform the ‘unix2dos’ commands.
Anyone adding new files to the project may have to remember to update this script as well.
I think there is a problem in Blue argument: “hope that Linux compilers and text editors will respect that and behave nicely.”
IIRC, UNIX-based editors typically just won’t go out of their way to delete carriage returns. If you’re editing the file, I don’t think they would typically detect that it is a DOS text file and add carriage returns as appropriate.
Also if you created a new estart.txt on your Linux distro you still get the problem of how the Symbian OS translates it. So the relevant component will still have to consider how to deal with this.
You also have to assume that all Windows editors can cope without carriage returns if two developers on different OS’s add lines to the same source file.
I would definitely swallow the RED pill – and no, it’s not because I’m particularly pro-*nix.
The Win32text extension, which is enabled by default on Windows installations, assumes that ASCII files are stored in *nix format. The Win32text extension ensures that both Windows and *nix users end up with ASCII files in their native format, so everyone should be happy… except users of Mac OS up to version 9 and of OS-9, which use CR (Mac OS X uses LF, like *nix).
See http://www.selenic.com/mercurial/wiki/Win32TextExtension for more info about the Win32Text extension
+1 for RED
+1 for Mercurial
I gave up on the unfortunate Microsoft choice of using CR+LF a long time ago. More and more programmers’ editors for Windows have native support for just LF as a line ending separator. I say release only Linux-formatted code and let Windows programmers deal with it – any self-respecting programmer will know how to do it. Also, you will be helping to end the stupidity that CR+LF is.
Wow – thanks everyone for responding. I’ll try to summarise what I’ve learned from your postings and where I think this is heading…
First up, there were a few questions which need an answer.
Why Mercurial?
We wanted a truly distributed SCM system. I’ll have the details of the evaluations etc posted on the website rather than go into them here. I was the person who checked the Linux/Windows compatibility angle, so perhaps I didn’t do a very good job.
What’s happening about the SDK, replacing the emulator, working on Linux?
Good questions, and I’ve asked for people to address them with new blog postings for you to comment on.
What do Mozilla and Python do?
Mozilla’s FAQ admits that they don’t know, but they want Linux line-endings in text files. Python only decided on 30th March to adopt Mercurial, so there’s no information yet. Mercurial itself has Linux line endings in the repository. It looks to me as though NetBeans is the exception in actively requiring Windows users to configure the extension.
From your comments, I conclude that:
The approach I’m looking at is therefore a variation on BLUE (no compulsion about format) which aims to have the bulk of the files delivered into the Foundation using the Linux line-ending format. For that to work, I have to agree with the Nokia Configuration Management folk about how they arrange their delivery of code changes from their Perforce and Synergy systems – I will make a new blog posting to let you know how that goes, but feel free to comment further in this one.
Finally, I turned off the line-ending conversion stuff in my Mercurial.ini file while examining the Mozilla-central and hg-stable repositories, and I don’t think I’ll bother to turn it back on.
Thanks again,
William
I think I would favour the Blue choice (work with what you’ve got), but perhaps you could produce a hook that prevented the commit of line-end-only changes?
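The test at the heart of such a hook could be as simple as the following Python sketch. The function names are hypothetical and nothing here is hooked into Mercurial: it only decides whether two versions of a file differ in their line endings and nothing else.

```python
def normalise(data: bytes) -> bytes:
    """Rewrite CR+LF and bare CR as LF so that only the text itself is compared."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def is_eol_only_change(old: bytes, new: bytes) -> bool:
    """True when the bytes differ but the text is the same once endings are normalised."""
    return old != new and normalise(old) == normalise(new)


print(is_eol_only_change(b"a\r\nb\r\n", b"a\nb\n"))  # True: only the endings changed
print(is_eol_only_change(b"a\nb\n", b"a\nc\n"))      # False: a real edit
print(is_eol_only_change(b"a\nb\n", b"a\nb\n"))      # False: nothing changed
```

A hook built on this would reject any commit in which a file’s old and new contents make the function return True.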
I would go for the “make it easy” for the developer.
My approach would be to make one choice (which one really does not matter) and let whatever tool is used to upload the code deal with it by autodetecting and being drastic about it. (I would expect the IDE to have an option to prepare the files for the check-in process automatically anyway.)
So as not to feel that I have not answered your question, I would recommend aligning the choice with the strategy (or at least the prediction) of what the developer base will be. The Symbian Foundation will want to make it easy for developers (of the OS and of the applications). Does the Foundation expect those crowds to be distinct or to have potential overlap (do you expect people to become interested in the OS because they are app developers, or vice versa)? If the crowds are distinct, keep what makes the builds break least (could be what people are used to or what technology is most robust, your call). If the vision is to bring people from the developer realm to be more interested in the OS and reduce their pain, look at what you expect your developers to be in 3 years’ time and choose what they would choose.
I definitely agree with Deny Watanabe. Unix line endings would be my personal choice. I didn’t experience any difficulties with this format when I was collaborating on development in C/C++ (for PC): a few of us were using Windows and the rest of the team was using Linux (we were using SVN and no explicit EOL settings).
Glad that so many computer pros from both the Windows and Linux sides have contributed! This is what the Symbian Foundation is trying to foster in so many respects. However, for the sake of us users, coders need to seriously TRIAL the ROMs and apps they make, using them daily for a month with a Symbian device as their only device, to understand what users go through when software has only been tested in a closed environment with one app running, yet cannot perform with eight apps running side by side.