Chapter 16. Fetching

Fetching And Updating

Following a convention for the "update" and "fetch" targets makes it easy for users to know how to use a recipe. The main recipe for a project should be usable in three ways:

  1. Without specifying a target.

    This should build the program in the usual way. Files with a "fetch" attribute are obtained when they are missing.

  2. With the "fetch" target.

    This should obtain the latest version of all the files for the program, without building the program.

  3. With the "update" target.

    This should fetch all the files for the program and then build it. It's like the previous two ways combined.

Here is an example of a recipe that works this way:

        Status = status.txt
        Source = main.c version.c
        Header = common.h
        Target = myprog

        $Target : $Source $Status
            :cat $Status
            :do build $Source

        # specify where to fetch the files from
        :attr {fetch = cvs://:pserver:anonymous@myproject.cvs.sourceforge.net:/cvsroot/myproject} $Source $Header
        :attr {fetch = ftp://ftp.myproject.org/pub/%file%} $Status

Note that the header file "common.h" is given a "fetch" attribute, but it is not specified in the dependency. The automatic dependency checking will notice the file is used and fetch it when it's missing.

When using files that include a version number in the file name, fetching isn't needed, since these files will never change. To reduce the overhead caused by checking for changes, give these files a "constant" attribute (with a non-empty non-zero value). Example:

        PATCH = patches/fix-1.034.diff {fetch = $FTPDIR} {constant}

To fetch all files that have a "fetch" attribute, start Aap with this command:

        aap fetch

When the "fetch" target is not specified in the recipe or its children, it is automatically generated. Its build commands will fetch all nodes with the "fetch" attribute, except ones with a "constant" attribute set (non-empty non-zero). To do the same manually:

        fetch:
                :fetch $Source $Header $Status

Or use the :fetchall command.
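
The selection rule for the generated "fetch" target can be pictured with a small Python sketch. This is an illustration only, not Aap's implementation: the node table, the function names and the ftp.example.org location are hypothetical, and the check for "non-empty non-zero" is an assumption about how the attribute value is interpreted.

```python
def attr_is_set(value):
    """Treat a missing, empty or zero value as "not set" (assumed rule)."""
    return value not in (None, "", "0", 0)

def nodes_to_fetch(nodes):
    """Return the nodes that have a "fetch" attribute and no set "constant"."""
    selected = []
    for name, attrs in nodes.items():
        if attrs.get("fetch") and not attr_is_set(attrs.get("constant")):
            selected.append(name)
    return selected

# Hypothetical node table: file name -> attributes
nodes = {
    "main.c": {"fetch": "cvs://"},
    "status.txt": {"fetch": "ftp://ftp.myproject.org/pub/%file%"},
    "patches/fix-1.034.diff": {
        "fetch": "ftp://ftp.example.org/patches/%basename%",
        "constant": "1",
    },
    "notes.txt": {},
}

print(nodes_to_fetch(nodes))   # prints ['main.c', 'status.txt']
```

The patch file is skipped because its "constant" attribute is set, and "notes.txt" is skipped because it has no "fetch" attribute.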

NOTE: When any child recipe defines a "fetch" target, no automatic fetching is done for any of the recipes. This may not be what you expect.

When there is no "update" target it is automatically generated. It will invoke the "fetch" target and the default target(s) of the recipe. To do something similar manually:

        update: fetch $Target

The Fetch Attribute

The "fetch" attribute is used to specify a list of locations where the file can be fetched from. The word at the start defines the method used to fetch the file:

  ftp     from ftp server
  http    from http (www) server
  scp     secure copy
  rcp     remote copy (aka insecure copy)
  rsync   remote sync
  file    local file system
  cvs     from CVS repository. For a module that was already checked out
          the part after "cvs://" may be empty, CVS will then use the same
          server (CVSROOT) as when the checkout was done.
  other   user defined

These kinds of locations can be used:

        ftp://ftp.server.name//full/path/file
        ftp://ftp.server.name/relative/path/file
        http://www.server.name/path/file
        scp://host.name/path:path/file
        rcp://host.name/path:path/file
        rsync://host.name/path:path/file
        cvs://:METHOD:[[USER][:PASSWORD]@]HOSTNAME[:[PORT]]/path/to/repository
        file:~user/dir/file
        file:///etc/fstab

For a local file there are two possibilities: using "file://" or "file:". They both have the same meaning. "file:" is preferred, because the double slash is usually used before a machine name: "method://machine/path". A file is always local, thus leaving out "//machine" is the logical thing to do.

Note that for an absolute path, relative to the root of the file system, you use either one or three slashes, but not two. Thus "file:/etc/fstab" and "file:///etc/fstab" are the file "/etc/fstab". A relative path has two or no slashes, but keep in mind that moving the recipe will make it invalid. You can also use "file:~/file" or "file://~/file" for a file in your own home directory, and "file:~jan/file" or "file://~jan/file" for a file in the home directory of user "jan".
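
The slash rules can be summarized in a few lines of Python. This is a sketch of the rules as described here, not the code Aap uses; the function name "local_path" is made up for the illustration.

```python
import os.path

def local_path(url):
    """Map a "file:" location to a local path, following the slash rules."""
    if not url.startswith("file:"):
        raise ValueError("not a file: location: %r" % url)
    rest = url[len("file:"):]
    # "file://" means the same as "file:", so drop one optional "//".
    if rest.startswith("//"):
        rest = rest[2:]
    # A leading "~" or "~user" refers to a home directory.
    if rest.startswith("~"):
        return os.path.expanduser(rest)
    # What remains is absolute if it starts with "/", relative otherwise.
    return rest

print(local_path("file:/etc/fstab"))      # one slash: absolute
print(local_path("file:///etc/fstab"))    # three slashes: absolute
print(local_path("file://dir/file"))      # two slashes: relative
print(local_path("file:dir/file"))        # no slash: relative
```

Both absolute forms give "/etc/fstab" and both relative forms give "dir/file".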

In the "fetch" attribute the string "%file%" can be used where the path of the local target is to be inserted. This is useful when several files have a common directory. Similarly, "%basename%" can be used when only the last item in the path is to be inserted. This removes the directory part from the local file name, so it can be used when the remote directory has a different name and only the file name is the same. Examples:

        :attr {fetch = ftp://ftp.foo.org/pub/foo/%file%} src/include/bar.h

Gets the file "src/include/bar.h" from "ftp://ftp.foo.org/pub/foo/src/include/bar.h".

        :attr {fetch = ftp://ftp.foo.org/pub/foo/src-2.0/include/%basename%}
                          src/include/bar.h

Gets the file "src/include/bar.h" from "ftp://ftp.foo.org/pub/foo/src-2.0/include/bar.h".
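
The two substitutions amount to simple string replacement. The following Python sketch reproduces both examples; the function name "expand_fetch" is hypothetical and the code only illustrates the rule, it is not taken from Aap.

```python
import posixpath

def expand_fetch(location, target):
    """Substitute %file% and %basename% in a fetch location for one target."""
    location = location.replace("%file%", target)
    return location.replace("%basename%", posixpath.basename(target))

print(expand_fetch("ftp://ftp.foo.org/pub/foo/%file%", "src/include/bar.h"))
print(expand_fetch("ftp://ftp.foo.org/pub/foo/src-2.0/include/%basename%",
                   "src/include/bar.h"))
```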

Defining Your Own Method

To add a new fetch method, define a Python function with the name "fetch_method", where "method" is the word at the start of the location. The function will be called with four arguments:

  dict      a dictionary with references to all variable scopes (for
            expert users only)
  machine   the machine name from the url: what comes after the
            "scheme://" up to the first slash
  path      the path from the url: what comes after the slash after
            "machine"
  fname     the name of the file where to write the result

The function should return a non-zero number for success and zero for failure. Alternatively, it can raise an IOError exception with a meaningful error message. Here is an example:

    :python
        def fetch_foo(dict, machine, path, fname):
            from foolib import foo_the_file, FooError
            try:
                foo_the_file(machine, path, fname)
            except FooError, e:
                raise IOError, 'fetch_foo() failed: %s' % str(e)
            return 1

Note that a version control function takes precedence over a fetch function. Thus if "foo_command()" is defined, "fetch_foo()" will not be called.
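
For experimenting with the calling convention, here is a self-contained variant that needs only the Python standard library. It is a sketch under stated assumptions: the "mirror" method name and the MIRROR_ROOT environment variable are invented for this illustration, and the "except ... as" syntax requires Python 2.6 or later. It is written as plain Python; in a recipe the function would go in a ":python" block as in the example above.

```python
import os
import shutil

def fetch_mirror(dict, machine, path, fname):
    """Copy <mirror root>/<machine>/<path> to fname; return 1 on success.

    The "dict" argument is accepted to match the calling convention
    but is not used here.
    """
    # MIRROR_ROOT is a hypothetical setting, not something Aap defines.
    root = os.environ.get("MIRROR_ROOT", "/var/mirror")
    source = os.path.join(root, machine, path)
    try:
        shutil.copyfile(source, fname)
    except (IOError, OSError) as e:
        # Report the failure in the way the convention asks for.
        raise IOError("fetch_mirror() failed: %s" % str(e))
    return 1
```

With this function defined, a location such as "mirror://host.name/pub/file" would be split into the machine "host.name" and the path "pub/file" before the function is called.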

Caching

Remote files are downloaded when used. This can take quite a bit of time. Therefore downloaded files are cached and only downloaded again when outdated.

The cache can be spread over several directories. The list is specified with the $CACHE variable.

NOTE: Using a global, writable directory makes it possible to share the cache with other users, but only do this when you trust everybody who can log in to the system! Someone who wants to do harm or play a practical joke could put a bogus file in the cache.

A cached file becomes outdated as specified with the "cache_update" attribute or the $CACHEUPDATE variable. The value is a number and a name. Possible values for the name:

  day     number specifies days
  hour    number specifies hours
  min     number specifies minutes
  sec     number specifies seconds

The default is "12 hour".

When a cached file becomes outdated, the timestamp of the remote file is obtained. When it differs from the timestamp at the time the file was last downloaded, the file is downloaded again. When the remote file changes but doesn't get a new timestamp, this will not be noticed.
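
The decision described above can be sketched in Python as follows. This only illustrates the logic; the function names are hypothetical and the parsing assumes the value is written as a whole number and a name separated by a space, as in the default "12 hour".

```python
UNIT_SECONDS = {"day": 86400, "hour": 3600, "min": 60, "sec": 1}

def cache_update_seconds(value="12 hour"):
    """Convert a value such as "12 hour" to a number of seconds."""
    number, unit = value.split()
    return int(number) * UNIT_SECONDS[unit]

def needs_download(age_seconds, cached_stamp, remote_stamp,
                   cache_update="12 hour"):
    """Decide whether a cached file must be downloaded again."""
    if age_seconds < cache_update_seconds(cache_update):
        return False                      # not outdated: use the cached copy
    # Outdated: download only when the remote timestamp has changed.
    return remote_stamp != cached_stamp

print(cache_update_seconds("12 hour"))    # prints 43200
```

A file that changed remotely while keeping its timestamp gives equal values for "cached_stamp" and "remote_stamp", which is why such a change goes unnoticed.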

When fetching files, the cached files are not used (but may be updated).