Cron job on Mac OSX that saves files to Dropbox

My work machine is on all the time, and I figured it might work well for simple tasks like text file processing:
– Take input file
– Do something with it (add information, transform text, add columns, etc)
– Save file (either overwrite old one, or create a new output file)

I used Dropboxâ??s shared folder functionality to let multiple people submit their files for processing. Your Mac will see those shared folders as local, and if you set up a cron job, you are pretty much done.

Here are some steps that can help:

  • Write your script
  • Create a folder on Dropbox. Share it with people who will need access
  • Edit your crontab. Hereâ??s a great detailed blog post on how to do this
  • Confirm itâ??s working
  • Sit back and relax, as your work is done :)

Note: remember to use absolute paths for your scripts and if youâ??re passing file/folder paths as parameters, make sure those paths are also absolute.

Hereâ??s my example:

0,30 8,19 * * 1-5 python /Users/[name]/Dropbox/script.py 
/Users/[name]/[folder]/*.csv

This runs the script every half hour, between the hours of 8am and 7pm, Monday-Friday and processes all .csv files in the designated Dropbox folder.

And here’s a simplified Python script, maybe some will find it helpful.


import csv
import sys
import os.path

for filename in sys.argv[1:]:
    
  # don't need to process my example file 
  # or anything with the _out in the file name, 
  # since my script creates them     
  if (filename.find('example.csv') == -1) and (filename.find('_out')== -1):  
  	
  	# also check if script already ran 
        # and you have _out.csv file saved in your folder
  	if os.path.isfile(filename.replace('.csv', '_out.csv')) == False:
          with open(filename) as f:
	    input = csv.reader(open(filename, 'rU'), delimiter=',', quotechar='"')
	    output = csv.writer(open(filename.replace('.csv','_out.csv'), 'ab'), 
                     delimiter=',',quotechar='"', quoting=csv.QUOTE_MINIMAL)
		
	       rowcount = 0
		
		for row in input:
		  if rowcount ==  0:  # skips the first header row
		       output.writerow(row)
		       rowcount += 1 
		   
		  else:
			# do your file processing here
			output.writerow(row)
			rowcount += 1

Sometimes the solution is easier than you think

I’m lazy. But in a good way. The way that makes you simplify, automate, and get rid of unncessary work.

Here’s a quick example. Our insights team had a task at hand that could not be done manually. Writing a script to do it took half an hour. They were happy. Once in a while they would send me a file to be processed, I run a simple command, and the script spits out the file with results, which I send back.

But that’s too much work. How do I remove myself out of the picture and let them handle it all? I figured let’s use Dropbox. Users drop their files into a shared folder, the script checks every so often if anything new was placed there, then processes the files and saves results.

Somehow in the beginning I got too entangled in details, thinking about a server where I’d put the script (probably Heroku), then having to add Dropbox API integration, then making sure all the dependencies are installed… Then it occured to me: my work machine already sees the Dropbox directories as local folders. Why not just run the cron job on my work machine and be done with it?

So with a little script tweaking, instruction writing and cron job testing, this is done, and I’ve just removed a task from my list (however simple it might be).