Git directory outside working directory

I have an old PHP website whose code and content I wanted to version with Git. Normally Git keeps the repository in a “.git” directory inside the working directory, but that creates a problem: if the working directory is accessible from the web server, “.git” is accessible too.

Luckily, Git has an option to keep the repository directory somewhere else, using the GIT_DIR environment variable, so here is what I did:


$ vi .profile
GIT_DIR=/home/rayed/my_website_git
GIT_WORK_TREE=/var/www/my_website
export GIT_DIR
export GIT_WORK_TREE

Notice that the website is located in “/var/www/my_website”, but the repo lives in a completely different directory, “/home/rayed/my_website_git”, so the web server can’t expose it by mistake.
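The same separation can also be requested per command instead of through the environment, since Git accepts the `--git-dir` and `--work-tree` options. Here is a minimal sketch in Python for a deploy script, assuming the two paths above; the helper names `git_argv` and `git` are hypothetical and not part of my setup:

```python
import subprocess

GIT_DIR = "/home/rayed/my_website_git"   # repository, outside the web root
WORK_TREE = "/var/www/my_website"        # files served by the web server


def git_argv(*args, git_dir=GIT_DIR, work_tree=WORK_TREE):
    """Build a git command line for a repository kept outside the work tree."""
    return ["git", "--git-dir=" + git_dir, "--work-tree=" + work_tree, *args]


def git(*args):
    """Run git with those options (requires git to be installed)."""
    return subprocess.run(git_argv(*args), check=True)
```

Calling `git("status")` would then behave like running `git status` with GIT_DIR and GIT_WORK_TREE exported.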

Faster file syncing with Redis

The problem
At alriyadh.com, most of the site administration takes place inside the Alriyadh Newspaper offices, and as you can imagine, the bandwidth dedicated to the website team isn’t that big. This is why we designed our system in two parts: one inside the Alriyadh Newspaper internal data center, where local access is very fast, and another, accessible to the public, hosted at MeduNet. Database replication carries the website data between the two, and file system replication carries the website images and media.

Current Solution
MySQL replication is straightforward and very easy to set up, and since we are mostly replicating textual data, it is relatively fast and reliable, provided you have reliable Internet connectivity.

For image replication we used “rsync” run from a cron job. rsync is very efficient with bandwidth, but it had to scan a large number of directories to find which files weren’t synced yet, and this is usually a very slow process. This is why it would take a couple of minutes for some images to appear.

Redis to the rescue
Redis is a key-value store. It is similar to memcached in many ways, especially in being super fast and very suitable for caching, but it also has tons of extra features that make it really interesting.

The feature that caught my eye is “Publish/Subscribe”. Basically, a client “publishes” a message to a given channel, and all other clients that “subscribed” to the same channel receive the message. If no one has subscribed, no harm is done and the publish still succeeds; if multiple clients have subscribed, they all get a copy of the message.
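To make these semantics concrete, here is a tiny in-memory sketch in Python. It is not Redis and not how Redis is implemented; the class name `ToyBroker` and its methods are made up for illustration. Like the Redis PUBLISH command, `publish` returns the number of subscribers that received the message:

```python
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for publish/subscribe; not Redis itself."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # channel -> list of inboxes

    def subscribe(self, channel):
        """Register a new subscriber and return its private inbox."""
        inbox = []
        self._subscribers[channel].append(inbox)
        return inbox

    def publish(self, channel, message):
        """Deliver a copy to every subscriber; return how many received it."""
        inboxes = self._subscribers.get(channel, [])
        for inbox in inboxes:
            inbox.append(message)
        return len(inboxes)


broker = ToyBroker()
nobody = broker.publish("new_files", "a.jpg")     # no subscribers yet: still succeeds
first = broker.subscribe("new_files")
second = broker.subscribe("new_files")
delivered = broker.publish("new_files", "b.jpg")  # each inbox gets its own copy
```

Note that the message published before anyone subscribed is simply lost, which is why keeping the cron job as a fallback still makes sense.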

The plan was to have the CMS (written in PHP) publish a message with the name of any changed file, and a fast_sync daemon (written in Python) would collect these names and copy the files individually to the remote server using “rsync”, instead of syncing a whole directory.

Conclusion
Faster file system syncing, of course … you forgot the title already!
Seriously, file syncing now takes a few seconds instead of a couple of minutes or more. We could also generate image thumbnails inside the fast_sync daemon before syncing, and if everything fails we still have cron+rsync.

Code
On the CMS side, in PHP; this should be integrated into the CMS file manipulation code:

<?php
$r = new Redis();
$r->connect('127.0.0.1', 6379, 0.5);
$r->publish("new_files", "/path/to/www/htdocs/img/test.jpg");
?>

“fast_sync” in Python; this should run all the time, maybe started with “nohup fast_sync.py &”:

#!/usr/bin/env python3
# fast_sync: collect file names published on the "new_files" Redis channel
# and copy them to the remote server in small rsync batches.

import subprocess
import threading
import time

import redis


files = []               # pending file paths, shared between the two threads
lock = threading.Lock()  # guards access to "files"


def send_files():
    """Every 2 seconds, rsync whatever files have queued up."""
    while True:
        # Take the current batch and empty the shared queue
        with lock:
            my_files = list(files)
            del files[:]
        if my_files:
            # Argument list (no shell), so spaces in file names are safe
            subprocess.call(["rsync", "-azv"] + my_files + ["www:/path/to/www"])

        time.sleep(2)


def main():
    # Start file sending thread
    threading.Thread(target=send_files, daemon=True).start()

    # Start redis queue listener
    r = redis.Redis(host="localhost", port=6379, db=0, decode_responses=True)
    p = r.pubsub()
    p.subscribe("new_files")
    for msg in p.listen():
        # Skip subscribe confirmations, keep only published messages
        if msg["type"] != "message":
            continue
        # add file to sending queue
        path = msg["data"]
        print("LISTENER: Adding", path, flush=True)
        with lock:
            files.append(path)


if __name__ == "__main__":
    try:
        main()
    except KeyboardInterrupt:
        print("Stopped by user")
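
A batch can contain the same path more than once if the CMS publishes a file repeatedly within one two-second window. A small helper, sketched here as an assumption rather than part of the daemon above, could drop the duplicates (keeping first-seen order) and hand rsync an argument list instead of a shell string. The name `build_rsync_argv` and the sample paths are hypothetical:

```python
def build_rsync_argv(paths, dest="www:/path/to/www"):
    """Return an rsync command for the batch, without duplicate paths."""
    unique = list(dict.fromkeys(paths))  # dict keeps insertion order (Python 3.7+)
    return ["rsync", "-azv", *unique, dest]


batch = ["/img/a.jpg", "/img/b.jpg", "/img/a.jpg"]
argv = build_rsync_argv(batch)
```

The resulting list can be passed directly to `subprocess.call(argv)`, so file names with spaces need no quoting.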

TIOBE Programming Community Index


The TIOBE Programming Community index gives an indication of the popularity of programming languages. The index is updated once a month. The ratings are based on the number of skilled engineers world-wide, courses and third party vendors. The popular search engines Google, MSN, Yahoo!, and YouTube are used to calculate the ratings. Observe that the TIOBE index is not about the best programming language or the language in which most lines of code have been written.

Delphi in the top 10, cool!

Position   Position   Programming       Ratings    Delta      Status
Apr 2008   Apr 2007   Language          Apr 2008   Apr 2007
1          1          Java              20.529%    +2.17%     A
2          2          C                 14.684%    -0.25%     A
3          5          (Visual) Basic    11.699%    +3.42%     A
4          4          PHP               10.328%    +1.69%     A
5          3          C++               9.945%     -0.77%     A
6          6          Perl              5.934%     -0.10%     A
7          7          Python            4.534%     +0.72%     A
8          8          C#                3.834%     +0.28%     A
9          10         Ruby              2.855%     +0.06%     A
10         11         Delphi            2.665%     +0.33%     A