Info Node: (cvsbook.info)RCS Format

RCS Format
==========

You do not need to know any of the RCS format to use CVS (although there
is an excellent writeup included with the source distribution, see
doc/RCSFILES).  However, a basic understanding of the format can be of
immense help in troubleshooting CVS problems, so we'll take a brief peek
into one of the files, `hello.c,v'.  Here are its contents:

     head     1.1;
     branch   1.1.1;
     access   ;
     symbols  start:1.1.1.1 jrandom:1.1.1;
     locks    ; strict;
     comment  @ * @;
     
     1.1
     date     99.06.20.17.47.26;  author jrandom;  state Exp;
     branches 1.1.1.1;
     next;
     
     1.1.1.1
     date     99.06.20.17.47.26;  author jrandom;  state Exp;
     branches ;
     next;
     
     desc
     @@
     
     1.1
     log
     @Initial revision
     @
     text
     @#include <stdio.h>
     
     void
     main ()
     {
       printf ("Hello, world!\n");
     }
     @
     
     1.1.1.1
     log
     @initial import into CVS
     @
     text
     @@

Whew!  Most of that you can ignore; don't worry about the relationship
between 1.1 and 1.1.1.1, for example, or the implied 1.1.1 branch -
they aren't really significant from a user's or even an administrator's
point of view.  What you should try to grok is the overall format.  At
the top is a collection of header fields:

     head     1.1;
     branch   1.1.1;
     access   ;
     symbols  start:1.1.1.1 jrandom:1.1.1;
     locks    ; strict;
     comment  @ * @;

Farther down in the file are groups of meta-information about each
revision (but still not showing the contents of that revision), such as:

     1.1
     date     99.06.20.17.47.26;  author jrandom;  state Exp;
     branches 1.1.1.1;
     next     ;

And finally, the log message and text of an actual revision:

     1.1
     log
     @Initial revision
     @
     text
     @#include <stdio.h>
     
     void
     main ()
     {
       printf ("Hello, world!\n");
     }
     @
     
     1.1.1.1
     log
     @initial import into CVS
     @
     text
     @@

If you look closely, you'll see that the first revision's contents are
stored under the heading 1.1, but that the log message there is "Initial
revision", whereas the log message we actually used at import time was
"initial import into CVS", which appears farther down, under the heading
1.1.1.1.  You don't need to worry about this discrepancy right now.
It happens because imports are a special circumstance: in order to make
repeated imports into the same project have a useful effect, import
actually places the initial revision on both the main trunk and a
special branch (the reasons for this will become clearer when we look
at vendor branches; see Advanced CVS).  For now, you can treat
`1.1' and `1.1.1.1' as the same thing.

The file becomes even more revealing after we commit the first
modification to hello.c:

     floss$ cvs -Q co myproj
     floss$ cd myproj
     floss$ emacs hello.c
         (make some changes to the file)
     
     floss$ cvs ci -m "print goodbye too"
     cvs commit: Examining .
     cvs commit: Examining a-subdir
     cvs commit: Examining a-subdir/subsubdir
     cvs commit: Examining b-subdir
     Checking in hello.c;
     /usr/local/newrepos/myproj/hello.c,v  <--  hello.c
     new revision: 1.2; previous revision: 1.1
     done

If you look at hello.c,v in the repository now, you can see the effect
of the commit:

     head  1.2;
     access;
     symbols
           start:1.1.1.1 jrandom:1.1.1;
     locks; strict;
     comment   @ * @;
     
     1.2
     date   99.06.21.01.49.40;   author jrandom;   state Exp;
     branches;
     next   1.1;
     
     1.1
     date   99.06.20.17.47.26;   author jrandom;   state Exp;
     branches
            1.1.1.1;
     next   ;
     
     1.1.1.1
     date   99.06.20.17.47.26;   author jrandom;   state Exp;
     branches;
     next   ;
     
     desc
     @@
     
     1.2
     log
     @print goodbye too
     @
     text
     @#include <stdio.h>
     
     void
     main ()
     {
       printf ("Hello, world!\n");
       printf ("Goodbye, world!\n");
     }
     @
     
     1.1
     log
     @Initial revision
     @
     text
     @d7 1
     @
     
     1.1.1.1
     log
     @initial import into CVS
     @
     text
     @@

Now the full contents of Revision 1.2 are stored in the file, and the
text for Revision 1.1 has been replaced with the cryptic formula:

     d7 1

The `d7 1' is a diff code that means "starting at line 7, delete 1
line".  In other words, to derive Revision 1.1, delete line 7 from
Revision 1.2!  Try working through it yourself.  You'll see that it
does indeed produce Revision 1.1 - it simply does away with the line we
added to the file.
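
Working through it by hand can also be sketched in code.  The Python
function below is a minimal illustration, not the code CVS itself uses;
the name `apply_rcs_delta' is made up for this example.  It assumes the
delta uses the two commands of the RCS edit-script format (the format
produced by `diff -n'): `dN M' deletes M lines starting at line N, and
`aN M' adds the M lines that follow the command after line N, with line
numbers referring to the text being patched.

```python
def apply_rcs_delta(text_lines, delta_lines):
    """Apply an RCS-style edit script to a list of lines (illustrative only)."""
    result = []
    cursor = 0  # index of the next source line not yet copied or deleted
    i = 0
    while i < len(delta_lines):
        command = delta_lines[i]
        i += 1
        start, count = (int(n) for n in command[1:].split())
        if command[0] == "d":
            # Keep everything before line `start`, then skip `count` lines.
            result.extend(text_lines[cursor:start - 1])
            cursor = start - 1 + count
        elif command[0] == "a":
            # Keep everything through line `start`, then add the new lines.
            result.extend(text_lines[cursor:start])
            cursor = max(cursor, start)
            result.extend(delta_lines[i:i + count])
            i += count
        else:
            raise ValueError("unknown edit command: " + command)
    result.extend(text_lines[cursor:])  # copy whatever is left untouched
    return result


# Revision 1.2 of hello.c, one list element per line.
revision_1_2 = [
    "#include <stdio.h>",
    "",
    "void",
    "main ()",
    "{",
    '  printf ("Hello, world!\\n");',
    '  printf ("Goodbye, world!\\n");',
    "}",
]

# Patching backwards with the stored delta yields Revision 1.1.
revision_1_1 = apply_rcs_delta(revision_1_2, ["d7 1"])
print("\n".join(revision_1_1))
```

The printed result is the original hello.c without the "Goodbye, world!"
line.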

This demonstrates the basic principle of RCS format: it stores the most
recent trunk revision in full and each earlier revision only as the
difference from the revision after it, thereby saving a lot of space
compared with storing every revision in full.  To go backwards from the
most recent revision to the previous one, it patches the later revision
using the stored diff.  Of course, this means that the further back you
travel in the revision history, the more patch operations must be
performed.  For example, if the file is on Revision 1.7 and CVS is asked
to retrieve Revision 1.4, it has to produce 1.6 by patching backwards
from 1.7, then 1.5 by patching 1.6, then 1.4 by patching 1.5.
Fortunately, old revisions are also the ones least often retrieved, so
the RCS system works out pretty well in practice: the more recent the
revision, the cheaper it is to obtain.
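
The cost of the 1.7-to-1.4 example above can be made concrete with a
small Python sketch.  The function name `backward_patch_steps' is
hypothetical, and the sketch assumes both revisions are on the trunk,
share the same first number, and are numbered consecutively with no
revisions missing in between.

```python
def backward_patch_steps(head, target):
    """List the trunk revisions rebuilt, in order, going from head to target."""
    head_major, head_minor = (int(part) for part in head.split("."))
    target_major, target_minor = (int(part) for part in target.split("."))
    if head_major != target_major or target_minor > head_minor:
        raise ValueError("target must be a trunk revision at or before head")
    # One backward patch is applied for each revision between head and target.
    return [
        "%d.%d" % (head_major, minor)
        for minor in range(head_minor - 1, target_minor - 1, -1)
    ]


steps = backward_patch_steps("1.7", "1.4")
print(len(steps), "patches:", " -> ".join(steps))
```

Asking for the head revision itself returns an empty list, which matches
the observation that the newest revision is the cheapest to obtain.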

As for the header information at the top of the file, you don't need to
know what all of it means.  However, the effects of certain operations
show up very clearly in the headers, and a passing familiarity with them
may prove useful.

When you commit a new revision on the trunk, the `head' label is
updated (note how it became 1.2 in the preceding example, when the
second revision to hello.c was committed).  When you add a file as
binary or tag it, those operations are recorded in the headers as well.
As an example, we'll add foo.jpg as a binary file and then tag it a
couple of times:

     floss$ cvs add -kb foo.jpg
     cvs add: scheduling file 'foo.jpg' for addition
     cvs add: use 'cvs commit' to add this file permanently
     floss$ cvs -q commit -m "added a random image; ask jrandom@red-bean.com why"
     RCS file: /usr/local/newrepos/myproj/foo.jpg,v
     done
     Checking in foo.jpg;
     /usr/local/newrepos/myproj/foo.jpg,v  <--  foo.jpg
     initial revision: 1.1
     done
     floss$ cvs tag some_random_tag foo.jpg
     T foo.jpg
     floss$ cvs tag ANOTHER-TAG foo.jpg
     T foo.jpg
     floss$

Now examine the header section of foo.jpg,v in the repository:

     head   1.1;
     access;
     symbols
           ANOTHER-TAG:1.1
           some_random_tag:1.1;
     locks; strict;
     comment   @# @;
     expand    @b@;

Notice the b in the expand line at the end - it's due to our having
used the -kb flag when adding the file, and means the file won't undergo
keyword expansion or line-ending conversion, which would normally occur
during checkouts and updates if it were a regular text file.  The tags appear
in the symbols section, one tag per line - both of them are attached to
the first revision, since that's what was tagged both times. (This also
helps explain why tag names can only contain letters, numbers, hyphens,
and underscores.  If the tag itself contained colons or dots, the RCS
file's record of it might be ambiguous, because there would be no way to
find the textual boundary between the tag and the revision to which it
is attached.)
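
Because a tag can contain neither colons nor dots, the symbols field is
easy to pick apart.  The following Python sketch shows one way to do
it; `parse_symbols' is a made-up name, and the regular expression is a
simplification that assumes the word "symbols" does not appear earlier
in the header text.

```python
import re


def parse_symbols(header_text):
    """Return {tag: revision} from the symbols field of an RCS header (sketch)."""
    match = re.search(r"\bsymbols\b([^;]*);", header_text)
    if match is None:
        return {}
    tags = {}
    for entry in match.group(1).split():
        # Tags contain no colons, so the first colon ends the tag name.
        name, revision = entry.split(":", 1)
        tags[name] = revision
    return tags


# The header of foo.jpg,v as shown above.
header = """\
head   1.1;
access;
symbols
      ANOTHER-TAG:1.1
      some_random_tag:1.1;
locks; strict;
comment   @# @;
expand    @b@;
"""

print(parse_symbols(header))
```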

RCS Format Always Quotes @ Signs
================================

The `@' symbol is used as a field delimiter in RCS files, which means
that if one appears in the text of a file or in a log message, it must
be quoted (otherwise, CVS would incorrectly interpret it as marking the
end of that field).  It is quoted by doubling - that is, CVS always
interprets `@@' as "literal @ sign", never as "end of current field".
When we committed foo.jpg, the log message was

     "added a random image; ask jrandom@red-bean.com why"

which is stored in foo.jpg,v like this:

     1.1
     log
     @added a random image; ask jrandom@@red-bean.com why
     @

The @ sign in jrandom@@red-bean.com will be automatically unquoted
whenever CVS retrieves the log message:

     floss$ cvs log foo.jpg
     RCS file: /usr/local/newrepos/myproj/foo.jpg,v
     Working file: foo.jpg
     head: 1.1
     branch:
     locks: strict
     access list:
     symbolic names:
           ANOTHER-TAG: 1.1
           some_random_tag: 1.1
     keyword substitution: b
     total revisions: 1;	selected revisions: 1
     description:
     ----------------------------
     revision 1.1
     date: 1999/06/21 02:56:18;  author: jrandom;  state: Exp;
     added a random image; ask jrandom@red-bean.com why
     ============================================================================
     
     floss$

The only reason you should care is that if you ever find yourself
hand-editing RCS files (a rare circumstance, but not unheard of), you
must remember to use double @ signs in revision contents and log
messages.  If you don't, the RCS file will be corrupt and will probably
exhibit strange and undesirable behaviors.
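
The doubling rule is simple enough to sketch in a few lines of Python.
The helper names `rcs_quote' and `rcs_unquote' are invented for this
illustration and are not part of CVS or RCS; they only mirror the
quoting described above.

```python
def rcs_quote(text):
    """Wrap text in @ delimiters, doubling any @ inside it."""
    return "@" + text.replace("@", "@@") + "@"


def rcs_unquote(field):
    """Undo rcs_quote: strip the delimiters and collapse @@ back to @."""
    if len(field) < 2 or field[0] != "@" or field[-1] != "@":
        raise ValueError("field is not delimited by @ signs")
    return field[1:-1].replace("@@", "@")


# The log message from the foo.jpg commit, including its trailing newline.
message = "added a random image; ask jrandom@red-bean.com why\n"
stored = rcs_quote(message)
print(stored)
```

Unquoting the stored form gives back the original message, which is what
happens when `cvs log' displays it.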

Speaking of hand-editing RCS files, don't be fooled by the permissions
in the repository:

     floss$ ls -l
     total 6
     -r--r--r--   1 jrandom   users         410 Jun 20 12:47 README.txt,v
     drwxrwxr-x   3 jrandom   users        1024 Jun 20 21:56 a-subdir/
     drwxrwxr-x   2 jrandom   users        1024 Jun 20 21:56 b-subdir/
     -r--r--r--   1 jrandom   users         937 Jun 20 21:56 foo.jpg,v
     -r--r--r--   1 jrandom   users         564 Jun 20 21:11 hello.c,v
     
     floss$

(For those not fluent in Unix ls output, the `-r--r--r--' strings on the
left essentially mean that the files can be read but not changed.)
Although the files appear to be read-only for everyone, the directory
permissions must also be taken into account:

     floss$ ls -ld .
     drwxrwxr-x   4 jrandom   users        1024 Jun 20 22:16 ./
     floss$

The myproj/ directory itself - and its subdirectories - are all
writeable by the owner (jrandom) and the group (users).  This means that
CVS (running as jrandom, or as anyone in the users group) can create and
delete files in those directories, even if it can't directly edit files
already present.  CVS edits an RCS file by making a separate copy of it,
so you should also make all of your changes in a temporary copy, and
then replace the existing RCS file with the new one. (But please don't
ask why the files themselves are read-only - there are historical
reasons for that, having to do with the way RCS works when run as a
standalone program.)
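
The copy-then-replace approach might look like the following Python
sketch.  It is an assumption-laden illustration rather than what CVS
does internally: the function name `replace_via_temp_copy' and the
temporary-file prefix are arbitrary, it targets Unix-like systems, and
it does no locking at all, so it presumes nobody else is using the
repository while it runs.

```python
import os
import stat
import tempfile


def replace_via_temp_copy(path, transform):
    """Rewrite the file at `path` by building a new copy and swapping it in."""
    directory = os.path.dirname(os.path.abspath(path))
    mode = stat.S_IMODE(os.stat(path).st_mode)
    with open(path, "rb") as original:
        new_contents = transform(original.read())

    # Create the copy in the same directory so the final rename stays on one
    # filesystem; this needs write permission on the directory, not the file.
    fd, temp_path = tempfile.mkstemp(dir=directory, prefix="rcs-edit-")
    try:
        with os.fdopen(fd, "wb") as copy:
            copy.write(new_contents)
        os.chmod(temp_path, mode)    # keep the original (read-only) mode
        os.replace(temp_path, path)  # swap the new file into place
    except BaseException:
        os.unlink(temp_path)         # do not leave a stray copy behind
        raise
```

Here `transform' is any function that takes the old file contents as
bytes and returns the new contents, so the hand-edit itself stays
separate from the replacement step.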

Incidentally, having the files' group be `users' is probably not what
you want, considering that the top-level directory of the repository
was explicitly assigned group `cvs'.  You can correct the problem by
running chgrp from the top of the repository:

     floss$ cd /usr/local/newrepos
     floss$ chgrp -R cvs myproj

The usual Unix file-creation rules govern which group is assigned to new
files that appear in the repository, so once in a while you may need to
run chgrp or chmod on certain files or directories in the repository
(setting the SGID bit with `chmod g+s' is often a good strategy: it
makes children of a directory inherit the directory's group ownership,
which is usually what you want in the repository).  There are no hard
and fast rules about how you should structure repository permissions;
it just depends on who is working on what projects.

