Return Fulfilled or Rejected Promise

Today I ran into a situation where one of my functions was part of a chain of promises. Depending on a condition, it either needed to fetch some data asynchronously or skip the fetch entirely. My initial code looked similar to this:

var i = 0;

function getDataAsync() {
    var def = Q.defer(),
        j = ++i;

    console.log('fetching ' + j);
    setTimeout(function () {
        console.log('done ' + j);
        def.resolve();
    }, 1000);

    return def.promise;
}
 
function doSomething(willGetData) {
    if (willGetData) {
        return;
    }
    return getDataAsync()
        .then(getDataAsync)
        .then(getDataAsync);
}
 
function doSomethingMore(willGetData) {
    console.log('start');
    doSomething(willGetData)
        .then(function () {
            console.log('finish');
        });
}
 
doSomethingMore(true); //Cannot read property 'then' of undefined

You’ll notice the problem lies in doSomething, where I simply return undefined when the condition is met. The error occurs in doSomethingMore because then is not a property of undefined. I needed to return a fulfilled promise to keep the chain unbroken. My initial solution looked something like the code below, but it just felt awkward, like there was too much going on. Too much extra code and…yuck.

function doSomething(willGetData) {
    var def = Q.defer();

    if (willGetData) {
        def.resolve();
        return def.promise;
    }
    
    getDataAsync()
        .then(getDataAsync)
        .then(getDataAsync)
        .then(def.resolve, def.reject);

    return def.promise;
}

So I read through the Q documentation and found a more elegant solution.

function doSomething(willGetData) {
    if (willGetData) {
        return Q();
    }
    return getDataAsync()
        .then(getDataAsync)
        .then(getDataAsync);
}

Calling Q(value) where value isn’t a promise returns a promise fulfilled with value, so calling Q() with no argument returns a promise fulfilled with undefined. Conversely, if you need to return a rejected promise, you can use Q.reject(reason).
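
If you’re on native promises rather than Q, the same idea should carry over. Here’s a minimal sketch, assuming ES2015 promises, where Promise.resolve() stands in for Q() and Promise.reject() for Q.reject(). The skipFetch parameter and the failFast function are names I made up for the example.

```javascript
// Assumption: native ES2015 promises stand in for Q here.
function getDataAsync() {
    return new Promise(function (resolve) {
        setTimeout(function () {
            resolve('data');
        }, 10);
    });
}

function doSomething(skipFetch) {
    if (skipFetch) {
        return Promise.resolve();   // already fulfilled, value is undefined
    }
    return getDataAsync();
}

// Hypothetical helper: returns a promise that is already rejected.
function failFast(message) {
    return Promise.reject(new Error(message));
}

doSomething(true).then(function () {
    console.log('finish');          // the chain is never broken
});
```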

As a fun side note, I also toyed with this hack, which I don’t recommend:

function getThenable() {
    return {
        then: function (func) {
            func();
            return getThenable();
        }
    };
}

It could be used in place of Q() because it returns a ‘thenable’ object.
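
One reason I don’t recommend it: a real promise always calls its handlers asynchronously, while this thenable calls them immediately. It also ignores whatever the callback returns or throws. Here’s a small sketch of the ordering difference, using a native Promise for the comparison (my assumption, so it runs without Q).

```javascript
function getThenable() {
    return {
        then: function (func) {
            func();
            return getThenable();
        }
    };
}

var order = [];

getThenable().then(function () {
    order.push('thenable callback');   // runs immediately, inside then()
});
order.push('after thenable');

Promise.resolve().then(function () {
    order.push('promise callback');    // deferred until the current code finishes
});
order.push('after promise');

console.log(order.join(' | '));
// thenable callback | after thenable | after promise
```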

JSFiddle anyone?

Return Data in JavaScript Promises

As I was coding promises in a new module the other day, I wondered if I could return data in my anonymous function passed to then. Previously, it seems like I almost always returned data via resolve (i.e. def.resolve(data);). I probably just overcomplicated things and had too many promises involved. At any rate, I was almost sure what I wanted to do would work, but to test it in isolation I wrote up the functions below.

function createPromise(data) {
    return Q(data);
}
 
function doSomething() {
    var otherData = [1, 2, 3];
    return createPromise(['a', 'b', 'c'])
        .then(function (data) {
            console.log(data); //['a', 'b', 'c']
            return otherData;
        });
}
 
function doSomethingMore() {
    console.log('start');
    return doSomething()
        .then(function (otherData) {
            console.log(otherData); //[1, 2, 3]
            console.log('finish');
        });
}
 
doSomethingMore();

Some explanation and observations:

First. If you were unaware, calling Q(value) where value is not a promise returns a fulfilled promise with value.

Second. The createPromise function isn’t really necessary. I could just as easily substitute createPromise(['a', 'b', 'c']) in doSomething with Q(['a', 'b', 'c']). However, functions are often a bit more complex and I wanted the example to illustrate that this could very well involve 3 separate functions.

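Third. then returns a new promise, such that promise2 = promise1.then(onFulfilled, onRejected, onProgress). promise2 is not resolved until the appropriate onFulfilled or onRejected function has finished running, and it is resolved with whatever that function returns. That is why returning otherData from the handler works.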

Fourth. You could just as easily return another promise (promise3) from the handler, and promise2 would not settle until promise3 had been fulfilled or rejected. For example:

function doSomething() {
    var otherData = [1, 2, 3];
    return createPromise(['a', 'b', 'c'])
        .then(function (data) {
            console.log(data);
            return createPromise(otherData);
        });
}
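
To put the third and fourth points side by side, here’s a small sketch using native promises (my assumption, so it runs without Q). Each handler’s return value, whether plain data, another promise, or a thrown error, decides how the next promise in the chain settles.

```javascript
var seen = [];

var chain = Promise.resolve(['a', 'b', 'c'])
    .then(function (data) {
        seen.push(data.join(''));
        return [1, 2, 3];                    // plain value: passed to the next handler
    })
    .then(function (otherData) {
        seen.push(otherData.join(''));
        return Promise.resolve('nested');    // promise: the chain waits for it
    })
    .then(function (value) {
        seen.push(value);
        throw new Error('boom');             // throw: the next promise is rejected
    })
    .then(null, function (err) {
        seen.push(err.message);
        return seen;
    });

chain.then(function (result) {
    console.log(result); // [ 'abc', '123', 'nested', 'boom' ]
});
```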

Most of these details (if not all) are explained in the Q API documentation. Take a gander.

JSFiddle if you like.

Overpromising with Multiple Promises in JavaScript

When dealing with multiple promises in JavaScript, it’s easy to “over-promise” or create too many promises. For example, let’s say you have a function that needs to return a promise because it also calls a function that returns a promise:

function doSomething() {
    var def = Q.defer();
    createPromise()
        .then(function onFulfilled() {
            console.log('first then');
            runTimeout();
        })
        .then(function onFulfilled() {
            console.log('second then');
            runTimeout();
            def.resolve();
        });
    return def.promise;
}

Instead, this function could be tightened up to look like this:

function doSomething() {
    return createPromise()
        .then(function onFulfilled() {
            console.log('first then');
            runTimeout();
        })
        .then(function onFulfilled() {
            console.log('second then');
            runTimeout();
        });
}

You don’t need to create and return a new promise. Just return the promise from the function that creates a promise. Sometimes, you can’t get away with this, but often you can.

This example also demonstrates how then returns a promise and resolves that promise only after the onFulfilled (or onRejected) function is executed. I didn’t use any onRejected functions to keep things simple.
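
There’s another reason to avoid the extra deferred: if createPromise is rejected, nothing ever calls def.reject, so the caller’s promise never settles. Here’s a sketch of that failure mode using native promises (my assumption, to keep it runnable without Q). The shouldFail flag, the function names and the lostErrors array are made up for the demo, and the over-promised version has a logging handler added only so the swallowed error is visible.

```javascript
var lostErrors = [];

function createPromise(shouldFail) {
    return shouldFail
        ? Promise.reject(new Error('fetch failed'))
        : Promise.resolve('ok');
}

// Over-promised: wraps the chain in a second promise and only forwards success.
function doSomethingWrapped(shouldFail) {
    return new Promise(function (resolve) {
        createPromise(shouldFail).then(resolve, function (err) {
            lostErrors.push(err.message);   // demo only: the caller never sees this
        });
    });
}

// Tightened: returns the chain itself, so rejections propagate to the caller.
function doSomethingDirect(shouldFail) {
    return createPromise(shouldFail).then(function (value) {
        return value;
    });
}

var outcome = { wrapped: 'pending', direct: 'pending' };

doSomethingWrapped(true).then(
    function () { outcome.wrapped = 'fulfilled'; },
    function () { outcome.wrapped = 'rejected'; }
);
doSomethingDirect(true).then(
    function () { outcome.direct = 'fulfilled'; },
    function () { outcome.direct = 'rejected'; }
);

setTimeout(function () {
    console.log(outcome);      // { wrapped: 'pending', direct: 'rejected' }
    console.log(lostErrors);   // [ 'fetch failed' ]
}, 20);
```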

Now, to give it more context with a contrived working example:

var i = 0;

// some function that does both synchronous and asynchronous stuff
// does not return a promise nor does it need to
function runTimeout(defObj, time) {
    time = time || 2000;
    console.log('timeout start.');
    setTimeout(function () {
        console.log('timeout finish.');
        if (defObj) {
            defObj.resolve();
        }
    }, time);
}

// some function that creates and returns a promise
function createPromise() {
    var def = Q.defer();
    console.log('createPromise ' + ++i);
    runTimeout(def);
    return def.promise;
}
 
// some function that executes a function that returns a promise
// also needs to do stuff after the promise is resolved
function doSomething() {
    return createPromise()
        .then(function onFulfilled() {
            console.log('first then');
            runTimeout();
        })
        .then(function onFulfilled() {
            console.log('second then');
            runTimeout();
        });
}

// another function that executes a function that returns a promise
// lots of nested promises at this point
function doMore() {
    console.log('start');
    doSomething()
        .then(createPromise)
        .then(function () {
            console.log('finish');
        });
}
 
doMore();

Test it in this JSFiddle.

You could argue that the functions/promises in this (naive) example could be ‘flattened,’ but when you are developing modules with dependencies on other modules, you can’t always flatten your functions into one chain of promises in one function. Furthermore, you wouldn’t want to because you’ve designed your modules to encapsulate specific logic (i.e. single responsibility principle) and that’s good.

If you’re really digging into JavaScript and promises then you ought to read the Promises/A+ specification for exact details on how then works. It’s a bit scary when you first look at it, but the spec is outlined logically so it’s pretty easy to understand if you take the time.

JavaScript Function Declarations vs Function Expressions

I recently revisited the differences between Function Declarations (FD) and Function Expressions (FE). I wanted to see if there were any compelling reasons to use one over the other, or if both were considered acceptable. I’ve taken most of my thoughts from a blog post by Angus Croll. It’s worth a read if this is a new topic for you.

In general, FDs and FEs are interchangeable because they both create a function. However, there can be some interesting gotchas (as always with JavaScript).

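Even though all browsers handle Function Declarations within non-function blocks (e.g. if), they were technically prohibited by ES5. ES2015 added block-level function declarations, but outside of strict mode browsers keep legacy behavior for compatibility, so the result can still vary with the engine and the mode.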

function a() {
    if (false) {
        function b() { return 'Will it return?'; }
    }
    return b;
}
console.log(a()); // result depends on browser
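
If you need a function that only exists when a condition holds, one way to sidestep the ambiguity is to assign a Function Expression to a variable inside the block. A minimal sketch (the flag parameter is my addition so both branches can be exercised):

```javascript
function a(flag) {
    var b;   // declared once at function scope, undefined until assigned
    if (flag) {
        b = function () { return 'Will it return?'; };
    }
    return b;
}

console.log(a(false));    // undefined
console.log(a(true)());   // Will it return?
```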

The name of a Function Declaration is in scope both inside the function itself and in its enclosing (parent) scope. As a result, you can call it recursively and from the parent (it would be mostly useless otherwise). For example:

function minusOne(positiveNumber) {
    console.log(positiveNumber--);
    if (positiveNumber > -1) {
        return minusOne(positiveNumber);
    }
}
minusOne(2);   // 2 1 0

A Function Expression works just as well:

var minusOne = function (positiveNumber) {
    console.log(positiveNumber--);
    if (positiveNumber > -1) {
        return minusOne(positiveNumber);
    }
};
minusOne(2);   // 2 1 0

You could make the FE above a Named Function Expression (NFE) by naming the anonymous function subtractOne and it would still work. Furthermore, you could replace the recursive call to minusOne with subtractOne and it would work (assuming you named the anonymous function subtractOne). The following, however, would NOT work:

var minusOne = function subtractOne(positiveNumber) {
    console.log(positiveNumber--);
    if (positiveNumber > -1) {
        return minusOne(positiveNumber);
    }
};
subtractOne(2);   // subtractOne is not defined

The name of an NFE is only in scope inside the function itself, so you cannot call subtractOne from outside.
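
Here’s a quick sketch of where the NFE’s name is visible. The inner name also keeps recursion working if the outer variable is later reassigned, which is one practical reason to recurse through it. The alias variable is made up for the example.

```javascript
var minusOne = function subtractOne(positiveNumber) {
    // subtractOne is in scope here, inside the function body
    return positiveNumber > 0 ? subtractOne(positiveNumber - 1) : positiveNumber;
};

console.log(typeof subtractOne);   // undefined (the name does not leak out)
console.log(minusOne.name);        // subtractOne
console.log(minusOne(2));          // 0

// Reassigning the outer variable does not break the recursion.
var alias = minusOne;
minusOne = null;
console.log(alias(3));             // 0
```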

Here’s another function I wrote just for fun:

function moveToZero(number) {
    var absNumber = Math.abs(number);
    console.log(number);
    if (absNumber-- !== 0) {
        return moveToZero(number > -1 ? absNumber : ++number);
    }
}
moveToZero(2);  // 2 1 0
moveToZero(-2); // -2 -1 0

Function Expressions are more versatile. They are the essence of functional programming. You can create Function Expressions using anonymous functions. You can assign functions to objects as properties or, more specifically, assign them to prototypes. Immediately Invoked Function Expressions (IIFEs) are considered Function Expressions. As Croll points out, currying and composing use FEs.

One of the caveats to Function Expressions is that functions are often created using anonymous functions, which can make debugging a pain. As a workaround you can use Named Function Expressions, as seen in the examples above. However, NFEs are buggy in non-modern browsers (IE8 and below), where the name can leak into the enclosing scope. Always something to keep in mind.

Many developers steer away from Function Declarations because they can be confusing. There aren’t many, if any, times you cannot replace a Function Declaration with a Function Expression. FEs are often favored for consistency and versatility.
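
One likely source of that confusion is hoisting: a Function Declaration is available throughout its enclosing scope, even above the line where it appears, while a var holding a Function Expression is undefined until the assignment runs. A small sketch (the function names are made up):

```javascript
function hoistingDemo() {
    var results = [];

    results.push(typeof declared);    // 'function': the declaration is hoisted
    results.push(typeof expressed);   // 'undefined': only the var is hoisted

    function declared() {}
    var expressed = function () {};

    results.push(typeof expressed);   // 'function': the assignment has now run
    return results;
}

console.log(hoistingDemo()); // [ 'function', 'undefined', 'function' ]
```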

Difference Between bin and sbin

Ever been curious about the difference between bin and sbin? The ‘s’ in sbin means ‘system’. Therefore, system binaries reside in sbin directories.

As you may have noticed, there are a number of different bin directories in Linux. The best reference I’ve found for an understanding of various Linux folders is man hier. It provides a brief explanation of the Filesystem Hierarchy Standard (FHS) in Linux. I’ve included a summary of the various bin and sbin definitions below:

/bin
    This directory contains executable programs which are needed
    in single user mode and to bring the system up or repair it.

/sbin
    Like /bin, this directory holds commands needed to boot the 
    system, but which are usually not executed by normal users.

/usr/bin
    This is the primary directory for executable programs. Most
    programs executed by normal users which are not needed for 
    booting or for repairing the system and which are not
    installed locally should be placed in this directory.

/usr/local
    This is where programs which are local to the site typically
    go.

/usr/local/bin
    Binaries for programs local to the site.

/usr/local/sbin
    Locally installed programs for system administration.

If you want to create your own scripts and make them available to all users, you’re pretty safe adding them to /usr/local/bin. If you want to run scripts using cron or crontab, simply use the full path to the command (e.g. /home/user/command).

What I do is add my scripts to my local bin (~/bin) and then I create a symbolic link in /usr/local/bin to the commands I want to make public. As a result, I can manage all my scripts from the same directory but still make some of them publicly available since /usr/local/bin is added to $PATH.

Backup or Sync Remote Files Using rsync

I wrote a shell script the other day to sync remote files using rsync. Thought I’d share it since it took me some time to get it exactly how I wanted.

rsync -rtvP --delete --include=$PATTERN* --exclude=* -e "ssh -i $SSH_KEY -p $SSH_PORT" $USERNAME@$DOMAIN:$SERVER_PATH/ $BACKUP_PATH/ 2> $ERROR_LOG

Per the man page, rsync is:

a fast, versatile, remote (and local) file-copying tool

It can also synchronize folders, so it’s more than just a file copying tool like scp. Furthermore, rsync has a significant number of options, so the documentation is quite lengthy.

To explain the code snippet above, I’ll start with the options in order of use and why I used them. You’ll also notice that I used $VARIABLES throughout the script. The definitions of these variables (among a few others) were included in the original script, but their values were both private and irrelevant so I’ve simply excluded them.

-rtvP

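The -r or --recursive option tells rsync to recurse into directories, so you can specify a folder as the source or destination. Don’t forget to add a trailing slash (/) to the source path: with it, rsync copies the folder’s contents; without it, rsync copies the folder itself into the destination.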

The -t or --times option preserves the modification times on files when transferred. It’s often appropriate and preferred to use the -a or --archive option which is the same as using -rlptgoD. These options combined will recurse, copy symlinks as symlinks, and preserve permissions, modification times, group, owner, device files and special files (respectively). Perfect for archiving, but not what I wanted at the time.

The -v or --verbose option just causes rsync to be more ‘chatty’ and tell you what it’s doing.

The -P option is the same as adding --partial --progress. In essence, rsync will keep partial files if the transfer is interrupted and tell you the progress of the file transfer via standard output (your terminal screen…unless you redirect it). I wanted both these options so I chose -P.

--delete

Delete any files from the destination that do NOT exist in the source. There are a variety of other delete options to pick from should you need them.

--include=$PATTERN* --exclude=*

The --include and --exclude options take patterns that are matched against files in the source. The source folder included a number of files; however, I only wanted files that matched a specific pattern. In this case, all the files I wanted were prepended with something like ‘backup’, so that’s the value I assigned to $PATTERN. The filenames also included variable data like a timestamp, so in addition to the prefix I used the wildcard (*) to match any suffix.

If I hadn’t added the --exclude option, I still would have transferred all the files from the source folder. --include only explicitly says what should be included. It is NOT exclusive. Thus, I added --exclude=* which matches all other files. These filter rules are checked in order and the first one that matches a file wins, which is why --include has to come before --exclude=*. You can use multiple --include and --exclude options as needed. See man rsync for more info.
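
rsync checks each file against the filter rules in order, and the first rule that matches decides what happens to it. To make that concrete, here’s a deliberately simplified model written in JavaScript. It is not how rsync is implemented, and it only understands the * wildcard on plain filenames; the function names and the rule format are made up for illustration.

```javascript
// Simplified model only: supports '*' on plain filenames, nothing else.
function patternToRegExp(pattern) {
    var escaped = pattern.split('*').map(function (part) {
        return part.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
    });
    return new RegExp('^' + escaped.join('[^/]*') + '$');
}

function isTransferred(filename, rules) {
    for (var i = 0; i < rules.length; i++) {
        if (patternToRegExp(rules[i].pattern).test(filename)) {
            return rules[i].type === 'include';   // first matching rule decides
        }
    }
    return true;   // no rule matched: the file is transferred
}

var rules = [
    { type: 'include', pattern: 'backup*' },
    { type: 'exclude', pattern: '*' }
];
var reversed = rules.slice().reverse();

console.log(isTransferred('backup-2014-01-01.tar.gz', rules));      // true
console.log(isTransferred('notes.txt', rules));                     // false
console.log(isTransferred('backup-2014-01-01.tar.gz', reversed));   // false
console.log(isTransferred('notes.txt', [rules[0]]));                // true
```

The last two lines show why the order matters and why --include on its own is not enough: with the rules reversed, --exclude=* matches first and nothing is transferred, and with only the include rule, unmatched files still go through.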

-e "ssh -i $SSH_KEY -p $SSH_PORT"

The -e or --rsh=COMMAND option allows you to specify which remote shell you want to use. I believe ssh is the default on most distributions. However, I also wanted to specify the private key I would use to authenticate with the remote server and which port to connect on. -e allows me to specify these settings.

$USERNAME@$DOMAIN:$SERVER_PATH/

The source path. Since it’s on a remote host, I’ve specified the username and the hostname. Notice the trailing slash for my folder.

$BACKUP_PATH/

The destination path. Notice the trailing slash for my folder.

2> $ERROR_LOG

I chose to redirect all errors from stderr to a specific log file.

Final Thoughts

If you want to test the command to make sure it works, just add the --dry-run option. I highly recommend it.

I’d also recommend creating a shell script file where you can define all your variables. It makes your script more readable and easier to edit in the future. Then you can add the script to your personal bin of scripts.