Ever want to turn a laptop into a webcam surveillance monitoring tool, or use a USB webcam to take pictures every 5 minutes to record a timelapse video? Or maybe you just want to monitor your room remotely and on demand? Here’s a good weekend (or day) project:

  • Installing Linux/Ubuntu on a laptop computer

  • Setting up SSH so you can connect to the computer remotely

  • Making sure Linux detects the webcam

  • Trying various command line webcam tools (fswebcam, ffmpeg, MPlayer, VLC)

  • Configuring crontab (creating a cronjob) to run every 5 minutes or hourly

  • Viewing the pictures in a web browser

Installing Linux (Ubuntu) on my laptop



Over the weekend I decided to turn one of my old laptops into a Linux server. My old netbook, the ASUS 1201N, was a perfect candidate since netbooks are designed to run with extremely low power consumption. Though the AC adapter for the ASUS 1201N is rated to output a maximum of 40 watts, I used my handy dandy Kill A Watt meter to measure the real power usage. Turns out it’s typically only 20W at idle, and around 25W under load. While this isn’t as ideal as a Raspberry Pi (which can run on only 5W), the difference in cost isn’t too crazy.

5 Watts for 1 month = 3.65 kilowatt hours at current prices = $0.32 / month
20 Watts for 1 month = 14.6 kilowatt hours at current prices = $1.28 / month
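
The arithmetic behind those two lines can be sketched in a few lines of shell. The function name monthly_cost is made up for illustration, and the default rate of $0.0877 per kilowatt hour is an assumption worked backwards from the figures above ($0.32 / 3.65 kWh), so plug in whatever your utility actually charges.

```shell
# Estimate the monthly electricity cost of a device that runs 24/7.
# Usage: monthly_cost WATTS [PRICE_PER_KWH]
# The default rate is an assumption inferred from the numbers in this post.
monthly_cost() {
    local watts="$1"
    local rate="${2:-0.0877}"
    # 8760 hours per year / 12 months = 730 hours in an average month
    awk -v w="$watts" -v r="$rate" 'BEGIN { printf "%.2f\n", w * 730 / 1000 * r }'
}

monthly_cost 5     # Raspberry Pi, prints 0.32
monthly_cost 20    # ASUS 1201N at idle, prints 1.28
```

Passing a second argument overrides the rate, for example monthly_cost 20 0.15.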


So I am happily paying $1.28 per month in electricity to have my very own Linux server running 24/7, complete with a 1.60 GHz dual core Intel Atom N330 CPU and 2 GB of memory. That’s quite a bit more horsepower than the Raspberry Pi, which can be helpful for doing video and image encoding.

How do you install Ubuntu? You just go to the Ubuntu website and download it. I actually used the Windows bootable USB utility, which made the process pretty simple. Note that the desktop version of Ubuntu does not come (by default) with an SSH server (more on that below), but it’s pretty easy to set up. The server version of Ubuntu can come with an SSH server if you select it during the install process.

Installing and Configuring the OpenSSH Server

If you want to be able to connect to your new Ubuntu machine remotely, you’ll need to install and have an SSH server running. You can do this if you have the desktop version of Ubuntu too. (I downloaded the desktop version). It’s pretty simple to get an SSH server:

sudo apt-get install openssh-server

That’s it! For more information on configuring SSH, see this excellent OpenSSH server guide that the Ubuntu community put together.

I changed the default port number from 22 to a random higher number to make it a little harder for random people out on the internet to try to log in. (They would still have to guess the username and password anyway, but this helps too.) Speaking of which, if you’re behind a router (as you probably are on a home network), you’ll need to set up port forwarding so you can log in remotely. Some more information about port forwarding can be found here.
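
Changing the port comes down to editing the Port line in /etc/ssh/sshd_config and restarting the SSH service. Here is a rough sketch of that edit as a small helper. The function name set_ssh_port and the port 2222 in the usage note are just examples I picked, and the sketch assumes GNU sed (which is what Ubuntu ships).

```shell
# Set the listening port in an sshd_config-style file.
# Usage: set_ssh_port CONFIG_FILE PORT
# Hypothetical helper for illustration; back up the file before touching the real one.
set_ssh_port() {
    local config="$1" port="$2"
    if grep -qE '^#?Port[[:space:]]' "$config"; then
        # Replace the existing (possibly commented-out) Port line
        sed -i -E "s/^#?Port[[:space:]].*/Port $port/" "$config"
    else
        # No Port line yet, so append one
        printf 'Port %s\n' "$port" >> "$config"
    fi
}
```

From a root shell you would run something like set_ssh_port /etc/ssh/sshd_config 2222, then sudo service ssh restart, and connect afterwards with ssh -p 2222 followed by your usual user@host. Remember to forward the same port on your router.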

How does Linux detect your webcam?

Well, if it’s a USB webcam, plug it in. If it’s an integrated webcam built into the laptop, there’s nothing to plug in. Ubuntu should automatically detect and install drivers for the webcam. To see if it’s detected, let’s run some commands:

Here’s a command to see if any video device nodes exist:

stephen@ubuntu:~$ ls -l /dev/video*
crw-rw----+ 1 root video 81, 0 Mar 18 20:29 /dev/video0
crw-rw----+ 1 root video 81, 1 Apr 2 08:03 /dev/video1

And here’s another command to find out about the devices:

stephen@ubuntu:~$ lsusb
Bus 001 Device 003: ID 046d:0990 Logitech, Inc. QuickCam Pro 9000
Bus 001 Device 002: ID 13d3:5111 IMC Networks Integrated Webcam
Bus 004 Device 002: ID 0b05:1788 ASUSTek Computer, Inc.
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub

More information on lsusb can be found by reading the lsusb ‘man page’ (otherwise known as the manual). You can also use the lsusb command to learn more about the resolution of the webcams. Just change the Bus and Device numbers that you found above.

stephen@ubuntu:~$ lsusb -s 001:002 -v | egrep "Width|Height"
wWidth 640
wHeight 480
wWidth 800
wHeight 600
wWidth 1024
wHeight 768
wWidth 1280
wHeight 720
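
If you would rather see those as WIDTHxHEIGHT pairs, a little awk can stitch each wWidth line to the wHeight line that follows it. This is only a sketch: list_resolutions is a name I made up, and it assumes the width always appears right before its matching height, as it does in the output above.

```shell
# Read `lsusb -v` output on stdin and print each resolution once as WIDTHxHEIGHT.
list_resolutions() {
    awk '$1 == "wWidth"  { width = $2 }
         $1 == "wHeight" { res = width "x" $2; if (!seen[res]++) print res }'
}
```

Usage would be something like lsusb -s 001:002 -v | list_resolutions, with the bus and device numbers swapped for your own.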

fswebcam, ffmpeg, MPlayer, and VLC

There are a few command line tools that will let you take a picture using your webcam. I’ve tried several different tools and found that I liked fswebcam the most, but I’ve listed all of the options here:

Note that you might need to change your device from /dev/video0 to perhaps /dev/video1! Check the above section to see what webcam is detected.

ffmpeg

ffmpeg -f video4linux2 -i /dev/video0 -vframes 1 test.jpeg
ffmpeg -f video4linux2 -s 640x480 -i /dev/video1 -vframes 1 /home/stephen/webcamphotos/$(date +\%Y\%m\%d\%H\%M).jpeg

fswebcam
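
A basic fswebcam capture might look like the sketch below. It only uses flags that show up in the fswebcam commands elsewhere in this post (-d, -r, --jpeg, -F). The helper names photo_path and take_photo, the 640x480 resolution, and the ~/webcamphotos default folder are my own assumptions for illustration. Note that the backslashes before each % in the other examples are only required inside a crontab, so they are left out here.

```shell
# Build a timestamped output path such as /home/stephen/webcamphotos/201304051633.jpeg
# Usage: photo_path DIRECTORY
photo_path() {
    local dir="$1"
    printf '%s/%s.jpeg\n' "$dir" "$(date +%Y%m%d%H%M)"
}

# Capture one photo with fswebcam.
# Usage: take_photo [DEVICE] [DIRECTORY]
# The defaults are assumptions; adjust them to your webcam and folder.
take_photo() {
    local device="${1:-/dev/video0}"
    local dir="${2:-$HOME/webcamphotos}"
    mkdir -p "$dir"
    # -F 5 captures 5 frames for the image, --jpeg 85 sets the JPEG quality
    fswebcam -d "$device" -r 640x480 --jpeg 85 -F 5 "$(photo_path "$dir")"
}
```

Running take_photo /dev/video1 would then save a picture from the second webcam.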

MPlayer

mplayer tv:// -tv driver=v4l2:device=/dev/video0:width=640:height=480 -frames 3 -vo jpeg

VLC

vlc -I dummy v4l2:///dev/video0 --video-filter scene --no-audio --scene-path /home/stoppal/test --scene-prefix image_prefix --scene-format png vlc://quit --run-time=1

Some notes about the Logitech QuickCam Pro 9000


The Logitech QuickCam Pro 9000 has an advertised maximum resolution of 1600×1200. Let’s try to run that with fswebcam.

stephen@ubuntu:~$ fswebcam -d /dev/video1 -r 1600x1200 --jpeg 85 -F 5 /home/stephen/webcamphotos/$(date +\%Y\%m\%d\%H\%M).jpeg
--- Opening /dev/video1...
Trying source module v4l2...
/dev/video1 opened.
No input was specified, using the first.
Adjusting resolution from 1600x1200 to 960x720.
--- Capturing 5 frames...
Captured 5 frames in 0.40 seconds. (12 fps)
--- Processing captured image...
Setting output format to JPEG, quality 85
Writing JPEG image to '/home/stephen/webcamphotos/201304051633.jpeg'.

Wait a second! Why did it adjust the resolution to 960×720?

It turns out we need to force it to use the YUYV palette instead of the default:

stephen@ubuntu:~$ fswebcam -d /dev/video1 -p YUYV -r 1600x1200 --jpeg 85 -F 5 /home/stephen/webcamphotos/$(date +\%Y\%m\%d\%H\%M).jpeg

Configure crontab (make a cronjob) to take a picture every minute or hour

Cron is a popular *nix utility that executes commands on a user-defined schedule, and crontab is the command (and file) used to configure it. Maybe you just want to take a picture every minute, or maybe you want to shut down your computer Monday through Friday at 10pm. Or maybe you want to run some scripts that back up your data once every 3 months. If you want to run multiple commands you can do so by chaining them with the && operator, but it’s also sometimes worth making a bash script (or maybe a simple Python/Perl/Ruby script) that gets executed as part of the cronjob.

To view the current cron jobs for the current user, type crontab -l. To edit them, type crontab -e.

Here are some cronjobs I have set up:

# To take a picture every minute
# */1 * * * * streamer -f jpeg -s 1024x768 -o /home/stephen/timelap/$(date +\%m\%d\%H\%M).jpeg

# To take a picture every hour on the 15 minute mark using a different tool
# 15 * * * * fswebcam -r 1024x768 --jpeg 85 -D 4 -F 10 /home/stephen/webcamphotos/$(date +\%Y\%m\%d\%H\%M).jpeg

# Take a picture and upload it to the webserver every hour
@hourly bash /home/stephen/scripts/take_photo_and_push.sh

The last cronjob calls a bash script that looks like this:

#!/bin/bash
# Take a picture, then push it to a remote webserver

# Take a photo (the backslashes before % are only needed inside a crontab, not in a script)
fswebcam -d /dev/video1 -p YUYV -r 1600x1200 --jpeg 85 -D 2 -F 15 "/home/stephen/webcamphotos/$(date +%Y%m%d%H%M).jpeg"

# Navigate to the directory, and stop if it does not exist
cd /home/stephen/webcamphotos/ || exit 1

# Find the most recent jpeg
NEW_JPEG=$(ls -t -- *.jpeg | head -n 1)

# Push it to the remote webserver
scp "/home/stephen/webcamphotos/$NEW_JPEG" stephen@netinstructions.com:/home/stephen/netinstructions.com/homeserver/latest.jpeg

For more information on cronjobs and crontab, take a look at this guide.
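
Since the intro mentions taking a picture every 5 minutes, here is a sketch of how that entry could be generated and installed. The two helper names are made up for illustration. The */N step syntax only gives evenly spaced runs when N divides 60 evenly (5, 10, 15, and so on).

```shell
# Print a crontab entry that runs a command every N minutes.
# Usage: cron_every_minutes N "COMMAND"
cron_every_minutes() {
    local minutes="$1" cmd="$2"
    printf '*/%s * * * * %s\n' "$minutes" "$cmd"
}

# Append an entry to the current user's crontab, keeping the existing entries.
# Usage: install_cron_entry "ENTRY"
install_cron_entry() {
    { crontab -l 2>/dev/null; printf '%s\n' "$1"; } | crontab -
}
```

For example, install_cron_entry "$(cron_every_minutes 5 'bash /home/stephen/scripts/take_photo_and_push.sh')" would run the script above every 5 minutes. You can get the same result by typing the line into crontab -e by hand.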

Viewing/Transferring the Pictures

Okay, so you found a command line utility that takes pictures, and perhaps a cronjob that runs that command every 5 minutes or 10 minutes or every hour or once a day, but how do you look at the picture?

There are a couple of ways of doing this. If you’re using the desktop version of Ubuntu (with a nice graphical user interface) you just double click on the photo. For the rest of us who are SSH’ing into a remote machine or are using the server version of Ubuntu or some other Linux distro, we have a few options:

  • FileZilla to grab the files and transfer them to our local machines
  • If you have a web server (Apache, nginx, or something else) on the server, move the file to the web directory
  • SCP the file to a remote web server. For example, I have a few websites (such as this one) hosted by Dreamhost, and they provide shell access

The command to securely transfer a file from one machine to another looks like this:

scp /home/stephen/webcamphotos/$NEW_JPEG stephen@netinstructions.com:/home/stephen/netinstructions.com/homeserver/latest.jpeg

Let’s take a moment to talk about technology in the consumer space. A few years ago, an exciting little device started popping up. At first it lurked around on some small tech blogs and action sport forums. Then it moved onto mainstream technology blogs, and once the ball started rolling, I started seeing it on television commercials. I don’t even own a television, yet the few times I found myself near a television this thing seemed to show up on every commercial break. Towards the end of last year, there was one GoPro video titled, tagged, and uploaded every minute on YouTube. How did this little device explode into popularity?

Let’s talk about what these little ‘GoPro’ cameras do:

  • Shoots video (in a wide variety of different resolutions and framerates)
  • Shoots still images (in a wide variety of delays, timers, and bursts)
  • Shoots underwater, in the dirt, in the rain, in space, or just about anywhere you want.

See a theme here? It’s crazy flexible. When I was debating purchasing one I thought about all the different use cases and what sort of audiences this camera was being marketed to. It’s great for documenting action sports. It’s great for people who just want to take some pictures or video underwater. It’s great for people who want to add a high-quality video recording unit to their remote controlled airplane. I was thinking it’d be awesome for some time-lapse video because of the built-in intervalometers. Some expensive DSLRs don’t even have built-in intervalometers!

a camera you want to love made by a company you can't help but hate

Let’s talk about features:

  • Rugged
  • Cheap
  • Simple to use

And let’s elaborate on the simple to use part. There are two buttons (at least on the second generation unit I had): Power and Mode. There’s no camera focus to deal with. There’s no LCD screen for people to fiddle with composition, and the wide-angle lens means if the camera is pointed at your subject, it’s going to be in the shot. The ruggedness factors into the simplicity as well, because I can throw it in my backpack, tie it to a kite, use it in the rain or leave it near the swimming pool. It requires little thought or worry. It’s a brilliant, flexible, easy-to-use, inexpensive device. But I am worried.

Though You Had Strong Hardware and Insane Viral Marketing Success, your Software and Customer Support Failed.

(And Failed in the Most Ridiculous, Absurd, How-The-Hell-Did-This-Get-Past-Your-QA-Department Way)

What’s my complaint? Well, as mentioned, I was planning on doing some time lapse photography using the built-in intervalometer that came with the camera. I was also going on a family vacation the next week, and wanted to try out the underwater video recording in the pool and ocean. So I ordered their newest (at the time) $300 GoPro Hero HD, a chest strap, and a few days later was happily recording video and shooting pictures. Looking at the footage, the quality impressed the hell out of me. I finally understood why everyone wanted HD televisions. The time-lapse footage came out pretty great as well. A big, wide-angle shot, perfect for watching clouds and shadows race across the screen. But little did I know that with each of these experiments my camera was slowly ticking away, its life and usefulness decreasing with every single shot.


Yup, at one point I tied it to a kite.

Fast-forward to the end of my vacation. I shot probably 50 or so videos of my family jumping off the diving board into the water and thousands of still images that would later become some neat time lapse segments. I was getting slightly less than ideal battery life, which I thought was an okay compromise for all the great shots, but what was really starting to get on my nerves was that I couldn’t seem to take more than a couple of time lapse photos. I’d set it to the timed image mode, press the shutter button, and it would take a couple of photos (I could see the little red light blink), but then it would mysteriously stop. Also, the 3 digit LCD display seemed to be stuck at 999. I had seen this before when doing extended time lapses. It meant I had taken over 999 videos or pictures, exceeding the LCD display limit, but when plugged into the computer, I would find my files rolled over like GOPR0999.jpg and then GOPR1000.jpg, GOPR1001.jpg, etc.

Turns out that although it was smart enough to roll over after 999 photos, it wasn’t smart enough to roll over a few more times. After some hours of troubleshooting (which included connecting it to the computer, transferring files, reformatting the SD card, trying different SD cards, power cycling the camera, etc.) I gave up. I couldn’t get it to take any more photos or video. Thankfully it was the last day of my vacation. Although that meant I would miss recording my four-year-old cousin finally getting up the nerve to swim across the pool without floaters, I would have the next couple of days alone to get my new camera working again.

Eventually I felt helpless. I decided to shoot GoPro an email (on 7/21/10):

Hello I recently purchased a GoPro HD Hero and it worked fine for a few weeks. Now however when I try to set up an image sequence (set to once every 5 seconds), it takes about 8 or 9 photos and then locks up. I cannot turn it off or stop its capture. When I plug it into my computer, I see that there are hundreds of empty folders. Only in the first folder are there about 8 or 9 photos.

I tried powercycling it by taking the battery out and putting it back in. I also wipe out the memory card both via the computer, as well as selecting the ‘delete all’ option on-camera. Whenever I try to do an image sequence, it just locks up after the first 8 or 9 photos. I am guessing it just starts making empty folders every 5 seconds on the memory card at that point.

I was happy with the product initially, but this is a very frustrating problem. I have missed opportunities to take awesome image sequences. Please advise what I can do about this.

Stephen

The next day I received this response:

Hi Stephen,

Do you take many time lapse pictures? What may have happened, is you may have encountered a known issue with our current firmware, in which the camera is no longer able to save files after it has taken 9999 images. Could you please let me know what the name of the last successfully captured image was? If this is the case, we would need you to send in your camera, and would reflash your firmware to fix the issue.

Otherwise, what brand/specification of SD card are you using? We have had many users facing issues with less reputable SD card manufacturers, and this could potentially creating the issue. In house we use Kingston and Patriot brand Class 4 or higher SD cards, and can fully recommend these.

Please let me know if this helps.

Many Thanks,
GoPro Support

to which I replied:

Yes, I do take many time lapse pictures. That was one of the reasons why I purchased the camera not too long ago. The last successfully captured image filename is GOPR9999.jpg in the folder 100GOPRO and there are a lot of empty folders such as 101GOPRO and 102GOPRO etc.

I have a Kingston SDHC 16GB Class 4 memory card, which I do not believe is the issue.

Is the only solution to send this camera to you to reflash the firmware? Is this something I cannot do if you send me the firmware and instructions? Is there not a reset button on the camera somewhere?

This is a frustrating experience that the firmware will not let the user take over 9,999 pictures. I do not understand how this is a “known issue” and yet you still shipped me a camera that has such an arbitrary limit to the number of pictures. It is unfair to your customers to sell a camera that only takes XX number of pictures before it needs to get shipped back (on the customer’s dime) and “reset” due to a software error.

Stephen

And here was the last email from them:

Update for Case #57639 – “Picture sequence not working”

Hi Stephen,

I’m sorry for the inconvenience. You are indeed experiencing an issue with our current firmware, in that its internal file counter cannot exceed 9999 and so is unable to save past this point. The only immediate fix we have is to reflash the firmware, which we need you to send the camera in to accomplish.

We are currently completing testing on our latest firmware release, which will fix the issue you are facing. We hope to have this released and available on our website by the end of summer, hopefully sooner, but need to make absolutely sure that image quality in all the new features is preserved. If you are in no hurry, you may elect to wait for this firmware upgrade, and will be notified of its release if you sign up for our newsletter:

http://www.goprocamera.com/newsletter

Please let me know how you would like to proceed.

Many Thanks,
GoPro Support

And that was it. They wanted me to either pay $20 in shipping to send my new camera across the United States and wait a week without it, just so they could “reset” a software bug and I could take another 9,999 photos, or sign up for their marketing campaign so I could be notified of a new firmware release which was expected by the end of summer. In troubleshooting this issue and browsing online forums, I noticed that users were complaining of repeated delays to firmware releases that the GoPro website had boldly advertised.
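
For what it’s worth, the rollover I would have expected is not complicated. Here is a toy sketch of it: once the four-digit file counter hits 9999, move on to the next numbered folder and start again at 0001. To be clear, this is my own illustration and has nothing to do with GoPro’s actual firmware. The function name next_photo_path is made up, and the folder and file naming simply follows what I saw on my SD card (100GOPRO/GOPR9999.jpg, then empty 101GOPRO, 102GOPRO folders).

```shell
# Given the last saved photo path, print the path the next photo should get.
# Hypothetical illustration of counter rollover, not GoPro's firmware.
# Usage: next_photo_path 100GOPRO/GOPR9999.jpg
next_photo_path() {
    printf '%s\n' "$1" | awk -F'[/.]' '{
        folder = substr($1, 1, 3) + 0      # "100GOPRO" -> 100
        number = substr($2, 5) + 0         # "GOPR9999" -> 9999
        # When the file counter is exhausted, move to the next folder and restart at 1
        if (number >= 9999) { folder++; number = 1 } else { number++ }
        printf "%03dGOPRO/GOPR%04d.jpg\n", folder, number
    }'
}

next_photo_path 100GOPRO/GOPR0999.jpg   # prints 100GOPRO/GOPR1000.jpg
next_photo_path 100GOPRO/GOPR9999.jpg   # prints 101GOPRO/GOPR0001.jpg
```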

What did I do? I quietly cursed at GoPro, paid the $20 in shipping and insurance, spent a week without my new camera, and silently vented my anger and frustration. Well, that was until now, with the publication of this blog post, nearly two years later.

Even though this happened long ago and the GoPro craze has only grown, looking back it still rubs me the wrong way. This is a product that I wanted to love, but made by a company that I cannot trust, and one that seriously shit on its customers.

You do not sell a fucking camera with a software-defined limited number of shots. 

Not to a few hundred people, not to a few thousand people, and in no way should you be selling hundreds of thousands of cameras around the world with a critical firmware or software bug that arbitrarily limits the number of photos you can take with it. But GoPro did exactly that. And even though I’ll probably get a next generation Hero at some point soon, I’ll be calling them assholes in my head as I click the order button, but that’s only because there isn’t a viable alternative to be seen on the market. Not yet.

But according to the latest reviews on Amazon, it doesn’t look like I’m alone at all.

Update on 3/21/2013

And if you thought it was absurd of them to sell a camera with a ‘cap’ to the number of photos you could take, they recently tried to use the Digital Millennium Copyright Act to forcibly remove a negative review of one of their products on a website. Yup, they’re ethically challenged monsters. They’re not getting a dime from me ever again.

Since I’m such a devout Google user, I often have Gmail, Google Calendar, and Google Drive (formerly Google Docs) open in three different tabs in Google’s web browser, Google Chrome. I’m often switching between tabs. Here’s what it looks like when I switch from each one on Google’s re-branded user interface:

showing the inconsistencies in Google's new navigation header

These sorts of tiny details make me go crazy! The logos are different sizes, the buttons are different sizes, and the sections are different widths. There’s also different spacing and indentation.

It makes me wonder what happened. Google tirelessly optimizes its user interface based on A/B testing of millions of users. Is it perhaps that I am in the “A” test group for Gmail, and in the “B” test group for Drive? Some people think this is fantastic. Although we’re getting there, this is still not what I would call a unified user interface.

Update 6/5/2012: Looks like it’s just a matter of the display density setting (comfortable, cozy, or compact), which doesn’t carry over from service to service.

Let’s take a quick count of all the ways we can send instant messages via Google products:

  1. The chat bar in Gmail
  2. The chat bar in Google+
  3. Google Talk for mobile phones
  4. Google Voice
  5. Gmail SMS
  6. Google Messenger (in the mobile app for Google+, formerly known as Huddle)

Some of these sound pretty similar. Why am I listing the chat bar in Gmail as a separate item from the chat bar in Google+? Because they are separate. They hold different contacts. As an example, here are the two chat screens, logged in from the same account and captured at the same moment in time.

Gmail chat vs. Google+ chat

Why aren’t these the same!? I’m sure someone at Google has a good reason for this, but it still leaves me in an awkward position when I want to chat with someone. What if I am browsing Google+ and have the urge to instant message my friend about the latest picture of a hat-wearing cat that he just shared? Taking a quick look at the chat sidebar I see that he’s not online. Oh, too bad, I think to myself. But a few minutes later I switch back to Gmail and see that he is online. What the heck? Why do I have more friends to talk to on Gmail than I do on Google+? Some of the chats carry over from Gmail to Google+ (such as if I have two browser tabs or windows open, one on each service) so it seems like they are… sort of integrated? Maybe?

Okay that is weird. What about the Google Talk app?

Let’s think of another common situation: I’m on my phone’s Google Talk app to chat with friends, and I want to share a picture. I use Google Talk instead of traditional SMS text messages very frequently. But when I am sending text messages, I have the option of sending MMS picture messages. This is all handled very seamlessly and isn’t too big of a deal. But when I am using my phone’s Google Talk app to chat, how the heck do I send a picture? Let’s think. I could:

  • Send an email with an attachment and tell the person to check their email
  • Send an MMS and tell the person to check their phone
  • Share it on Google+ and tell the person to visit my profile page
  • Start a conversation in Google Messenger (formerly Huddle, part of the Google+ App) and share the picture and tell the person to check their Google+ Messenger App

Why is this so difficult? An easy choice would be to send an email with an attachment. After all, if the person is on Google Talk they are probably on Gmail, right? Maybe not so, as you can have the Google+ page open to chat, without having a Gmail page open. So they might miss the email. It’s also not a good choice because many smart phones take 5 or even 8 megapixel pictures. The image that you want to share might be anywhere from 300 KB to 1.5 MB in size. By the time the person finally receives the email with the picture attached, it might be 20 minutes later, and it won’t have any context. There’s just no simple way of sharing pictures via Google Talk.

Does that mean we should all be communicating via Google Messenger (formerly Huddle) — the latest and greatest way to communicate?

Google Messenger adds the ability to send group messages and share pictures, besides offering basic one to one instant messaging. Unfortunately you can only use Google Messenger on your mobile phone. Which means I am definitely not going to use it while I am near a computer where I have a large screen and an actual keyboard. And I doubt my friends and contacts will be using it either. It also has no integration with Google Talk or any of the other ways to send instant messages through Google.

So… Google Voice?

Now take a minute to consider sending SMS messages to each other through Google. If you sign up for Google Voice you can get a special phone number. You can access Google Voice from a computer with its web browser interface, or you can access Google Voice on your mobile phone through the Google Voice app. Sounds nice, right? Unfortunately Google Voice offers no way to send a picture or an MMS equivalent.

The other problem is that all of your friends will probably need to add a second phone number to your contact information, or create a second contact entry for you. What’s even more confusing is when your friends go to text you back. Here’s what I have to think about each time I try to message my friend Nick:


Sending a text to the same contact.

And okay, whatever, I have two entries for Nick. But I have another friend, Joe, who also uses both his regular mobile number and Google Voice to text. Maybe they use Google Voice 95% of the time, but sometimes their cell phones may be in a weird reception spot where they don’t have 3G access (data access), which is necessary for Google Voice to work, but they have basic GSM service and are able to send SMS text messages. Or maybe they use Google Voice 95% of the time, but want to send a picture message or MMS equivalent. Nope, can’t do that in Google Voice. So every now and then I get messages from the same person that show up on my phone as coming from two different sources, and the conversations may be completely unrelated. A couple of hours later I have to ask myself: Which conversation thread do I reply to? Should I respond to the Google Voice number? Should I respond to the regular number?

All of this really raises the question: Why aren’t any of these instant message services integrated with each other? How hard can this be?

I’m really not a big Apple fan, but look at what they’re doing with iMessage. It seems like they understand what’s going on. Google on the other hand keeps introducing new ways to do the same thing (with small functional improvements) yet seems to have no desire to make them play nicely with one another.

And please tell me what they were thinking with this Gmail Labs Feature:

Sending a text message from Gmail (labs feature)

Oh a neat Google Labs feature! I can send text messages from Gmail?!

And here’s what the recipient sees:

Texting back to Gmail

Which really makes me wonder what is going on. I have a Google Voice number, so why not send the text from that number? Instead, it’s some random number that seems to be generated on the fly. And what is the recipient supposed to do? Text a message back to a random number? Or look, there’s an email address. Should the recipient email the person back? What if the sender walks away from their email? They wouldn’t have access to that particular gchat anymore, nor would they necessarily have access to an email.

Here’s just another Google product that fails to integrate with the rest. Granted this is a “beta” and “labs” feature, but all of these things really make me wonder: what is Google’s preferred way for us to send instant messages to each other? It seems like each of these products offers some neat functionality that the others lack, but they all fall short of delivering one easy go-to solution. And that’s kind of weird, because having all of my messages in one easily accessible spot seems to align very well with the company’s corporate mission:

Google’s mission is to organize the world’s information and make it universally accessible and useful

All of this is really unfortunate because I really want to use Google products and services. I like chatting in Gmail, and I tend to use Google Talk on my mobile phone just as often as traditional SMS text messages. So Google, please, for the love of this tech-savvy world, integrate some of your chatting services.

This past weekend my hard drive failed. Saturday night I attempted to turn on my computer and received the all-too-familiar message “DISK BOOT FAILURE, INSERT SYSTEM DISK AND PRESS ENTER” which is quite annoying. I immediately suspected my OCZ Agility 2 60 GB solid state drive, only because the same thing happened exactly 7 months and 1 day ago. My BIOS wouldn’t even recognize the hard drive, which means it failed pretty hard.

What I had to do exactly 7 months and 1 day ago involved contacting the OCZ customer support website and creating an RMA support ticket. The folks there were not very quick to respond, nor very sympathetic, but I did manage to get an RMA, mail my SSD back and get a “working” SSD about a week later. That drive failed this past weekend, meaning I’ll be contacting OCZ support and jumping through their hoops once again (ugh..). So let’s just recap my excellent experiences with these fancy SSDs that seem to be all the rage in this modern world:

SSD 1: purchased on 10/5/2010, failed on 4/3/2011. Age at death: 6 months 2 days
SSD 2: replaced on 4/11/2011, failed on 11/12/2011. Age at death: 7 months 1 day

The results are pretty scary. I could fill up 60 GB of data on that drive and lose all of it every 6 or 7 months. And I did. And it sucked. Am I just unlucky? Did I get two lemons?

Apparently not. Looking at Jeff Atwood’s collection of SSD lifespans, it would appear that solid state drives fail pretty often. Here are the numbers he found based on eight SSDs purchased over the last 2 years:

  • Super Talent 32 GB SSD, failed after 137 days
  • OCZ Vertex 1 250 GB SSD, failed after 512 days
  • G.Skill 64 GB SSD, failed after 251 days
  • G.Skill 64 GB SSD, failed after 276 days
  • Crucial 64 GB SSD, failed after 350 days
  • OCZ Agility 60 GB SSD, failed after 72 days
  • Intel X25-M 80 GB SSD, failed after 15 days
  • Intel X25-M 80 GB SSD, failed after 206 days
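
For a rough sense of scale, those eight lifespans average out to about 227 days, or roughly seven and a half months. Here is the quick arithmetic as a sketch; mean_days is just a helper name I made up.

```shell
# Days until failure for the eight SSDs listed above
lifespans="137 512 251 276 350 72 15 206"

# Print the mean of a space-separated list of numbers, rounded to a whole number
# Usage: mean_days "N1 N2 N3 ..."
mean_days() {
    echo "$1" | awk '{ for (i = 1; i <= NF; i++) sum += $i; n = NF }
                     END { printf "%.0f\n", sum / n }'
}

mean_days "$lifespans"   # prints 227
```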

Scary! I have about 8 regular magnetic hard disk drives, some of them over 5 years old, and not one of them has failed me. Is the speed that SSDs bring to the table worth the risk that comes with losing 60 GB of data twice a year? In my situation, yes. Here’s what my storage looks like (an incredible diagram, I know):

As you can see, the only data that I lose when my SSD fails every 6-7 months is the operating system (Windows 7 Professional) and some commonly used program files (Eclipse, Notepad++, Adobe Lightroom, Adobe Photoshop). All of these can be replaced relatively easily. In fact, I keep the software installers on my 1 TB mirrored hard drives. That’s where I store my precious data (photos from a 7 year time span, important documents, saved game files, etc.). Windows 7 has a neat disk management tool that allows me to set up mirrored hard drives without too much thinking. The other hard drives that aren’t mirrored (and therefore have no redundancy if they were to fail) contain data that would suck to lose, but wouldn’t cause me to cry. All of my movies and TV shows can be (slowly) replaced if I really wanted them again, program files can be reinstalled, and the misc files such as rendered compositions (I do some video editing and CGI work from time to time) can be re-rendered.

So why does my situation still suck? And why is the title of this article a shout-out to Puppy Linux? Because once my SSD dies, I lose my operating system. Without the operating system, I lose easy access to all of these files. My data is safe, but I can’t access it. When the drive my OS lived on died, my first instinct was to just re-install the OS somewhere else. But where? All of my hard drives had stuff on them. Stuff that I could lose if necessary, but I didn’t want to resort to that. If only I could just get basic access to the files and move some things around… Hmm…

That’s where Puppy Linux totally saved the day. I was able to download a 125 MB disk image (.iso) and place it on the USB drive to create a bootable USB “disk”. I stuck this USB drive into my computer and in my BIOS screen, selected the USB drive to boot from. Within a few minutes I had a fully functional operating system (a variant of Linux) which allowed me to see my hard drives and files. If I so desired, I could grab those important documents that were safely backed up on one of my hard drives and transfer them to a USB drive. However, my goal was just to delete what wasn’t important and relocate what was mildly important to another drive, thereby freeing up an entire HDD so I could install Windows 7 onto it. If I didn’t have any desire to do some gaming I could have just used the Puppy Linux OS (running from the USB drive) for the next week or so while I waited for OCZ to send me a replacement SSD.

Just a quick note: Downloading the disk image (.iso) file from Puppy Linux and dropping it onto your USB drive will not work. You’ll need to insert some magical code onto your USB drive so your computer can “boot” from it. You’ll also want to “unzip” or decompress the .iso file and actually transfer the contents of that disk image onto your USB drive instead of the .iso file. Disk images are actually great for burning onto a CD, or “mounting” onto virtual CD hardware, but when it comes to making a bootable USB drive they need a tiny bit of manipulating.

I tried a bunch of different ways of inserting that magical code that allowed the USB drive to be “bootable” without much luck. Fortunately I found UNetbootin, which streamlines the entire process of creating a bootable USB drive, and it worked perfectly for me.

Hard drive deaths suck. OCZ is a jerk. Here's my set up.

Although I didn't really need to open up my computer, I wanted to make sure that my SATA cable or port wasn't the culprit. Indeed it was the SSD. I also disconnected my "precious data" completely so I wouldn't accidentally reformat that.

Next time your operating system or a hard drive fails, consider booting from an OS that lives on your USB drive. You’ll be able to access your hard drives and recover your files so long as they are not corrupted.

So I just want to say a big thank you to Puppy Linux, a big thank you to the fine folks who wrote UNetbootin, and a disappointing “ughhhhhghghghggh” to OCZ who have twice been unsympathetic and very slow to help me with their faulty solid state drives.

Update 6/5/2012 – Further Reading

Months after publishing this post, I have found this recent article describing the inner workings of SSDs to be very enlightening. After reading it, I’m surprised OCZ even offers a 2 year warranty.

Are you teaching yourself how to program and write code? Or do you have extra time to kill and want to brush up on your skills? Did you already make the “Hello World!” program in your language of choice, and maybe a Fahrenheit-To-Celsius converter, and now you’re wondering what to do? While I was just starting out in the world of programming I often ran into this hurdle. Along the way I found some great resources and sources of inspiration to keep me going. Whether you’re hoping to become a game programmer, write the next mobile app, or just want some practice, I think any of these exercises will help. Even advanced programmers may find some of the resources helpful (or they may already use them!)

All of these suggestions are language-agnostic. You can (and should) use whatever programming language you are comfortable with. I’m not suggesting that you do all of the following. These are just some ideas– figure out what works best for you! I list the following exercises because this is what worked for me. I completed all of the following goals and walked away with a lot of satisfaction and knowledge. I hope you can too.

Some simple programming exercises after Hello World

I’ve heard many people say, “I want to make a website that lets you make other websites. Should I learn PHP jQuery or should I learn HTML? Can I make a site like Yahoo in Dreamweaver?” Another question I’ve seen is “I want to become a game programmer. What’s a good book for XNA programming that is for beginners?” Or, “I have this idea for an iPhone app. How do I make it?”

My reaction: Whoa buddy! Slow down!

Let’s just get this straight: Programming is (often) long, difficult, and very involved. Many games, mobile apps, and custom websites take months or possibly years to make. If you try to dive headfirst into creating the next hit mobile app or the latest and greatest website single-handedly you’ll get frustrated and discouraged very quickly. You’ll also probably make a lot of mistakes or end up going down the wrong path.

For beginner programmers who would rather not get discouraged my advice is to keep it simple. If you plan on making the next big thing you need to know how to do the small things first. So if you can make a Hello World application, can you also make a game of tic-tac-toe?

Write (in your language of choice) a text-based game of Tic-Tac-Toe

This is actually the first step towards making the latest and greatest website or mobile game/app. “But how does that make sense?” you might wonder. By completing this exercise you will surely ground yourself in the reality that is programming. You need to know how to read input in from a user, perform some logic on it, and spit out some new information back to the user. You need to understand how to keep repeating a loop until a goal is met. If you aren’t able to perform these relatively small tasks, good luck making the next mobile app. However this project is pretty simple and makes for a great next-step exercise. Once you complete this you’ll feel accomplished and ready to take on a more complicated project. Here’s the project goal:

  1. Assume the user (player) is an X and the computer is an O
  2. Ask the player where they want to place their X
  3. The computer places an O
  4. Output the 9 tiles showing where each player went
  5. Ask the player where they want to place their X
  6. Repeat until a winner is determined

Sound pretty simple? Go and make it! Then play it. Try and break it. Put an X where an O is. See what happens when there is a “Cat” game (no winner). Also, when I say text-based, here’s what I am suggesting:
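To make "text-based" a little more concrete, here is a rough Java sketch of two pieces of the exercise: printing the 9 tiles and checking for a winner. The class name, the flat 9-element board, and the exact output layout are my own assumptions for illustration. It deliberately leaves out reading the player's input and choosing the computer's move, since that is the part worth writing yourself.

```java
public class TicTacToeSketch {

    // The 8 winning lines, as indices into a 9-element board
    // (0 = top-left tile, 8 = bottom-right tile).
    private static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},   // rows
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},   // columns
        {0, 4, 8}, {2, 4, 6}               // diagonals
    };

    /** Renders the board as three text rows separated by divider lines. */
    public static String render(char[] board) {
        StringBuilder sb = new StringBuilder();
        for (int row = 0; row < 3; row++) {
            sb.append(' ').append(board[row * 3])
              .append(" | ").append(board[row * 3 + 1])
              .append(" | ").append(board[row * 3 + 2]).append('\n');
            if (row < 2) {
                sb.append("---+---+---\n");
            }
        }
        return sb.toString();
    }

    /** Returns 'X' or 'O' if that player owns a full line, otherwise a space. */
    public static char winner(char[] board) {
        for (int[] line : LINES) {
            char c = board[line[0]];
            if (c != ' ' && c == board[line[1]] && c == board[line[2]]) {
                return c;
            }
        }
        return ' ';
    }

    public static void main(String[] args) {
        // A hard-coded position for demonstration; a real game fills this in turn by turn.
        char[] board = {
            'X', 'O', ' ',
            ' ', 'X', 'O',
            ' ', ' ', 'X'
        };
        System.out.print(render(board));
        System.out.println("Winner: " + winner(board));
    }
}
```

A "Cat" game in this sketch would be a board with no spaces left where winner still returns a space.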

Write (in your language of choice) a text-based game of Hangman

Now that you have a working game of Tic Tac Toe, let’s try Hangman. The underlying programming that you’ll have to do will be fairly simple. Take what you’ve learned in creating Tic Tac Toe and use it to write really awesome code. Did you plan out your game of Tic Tac Toe before you started coding? Did you make functions, objects, or methods (or was it all in one big function?) Did you comment your code often and use meaningful variable names? If you didn’t do any of that while creating Tic Tac Toe, start doing it while you’re working on Hangman.

Here are the steps:

  1. Pick a random 5+ letter word, maybe from a list.
  2. Ask the player to guess letters
  3. If they guess correctly, show the progress.
  4. If they guess incorrectly, tell them how close their person is to being hanged.

Here’s what I would imagine a text-based version looks like:
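As a rough illustration of steps 3 and 4, here is a small Java sketch that shows the player's progress and counts wrong guesses. The class name, the underscore display, and the limit of 6 wrong guesses are all assumptions on my part; picking the random word and reading guesses from the keyboard are left out.

```java
import java.util.HashSet;
import java.util.Set;

public class HangmanSketch {

    // Assumed limit; pick whatever number of body parts your drawing has.
    public static final int MAX_WRONG = 6;

    /** Shows guessed letters in place and an underscore for each hidden letter. */
    public static String progress(String word, Set<Character> guessed) {
        StringBuilder sb = new StringBuilder();
        for (char c : word.toCharArray()) {
            sb.append(guessed.contains(c) ? c : '_').append(' ');
        }
        return sb.toString().trim();
    }

    /** Counts the guessed letters that do not appear anywhere in the word. */
    public static int wrongGuesses(String word, Set<Character> guessed) {
        int wrong = 0;
        for (char g : guessed) {
            if (word.indexOf(g) < 0) {
                wrong++;
            }
        }
        return wrong;
    }

    public static void main(String[] args) {
        String word = "puzzle";   // would normally be picked at random from a list
        Set<Character> guessed = new HashSet<>();
        guessed.add('z');
        guessed.add('e');
        guessed.add('q');

        System.out.println(progress(word, guessed));
        System.out.println("Wrong guesses: " + wrongGuesses(word, guessed) + " of " + MAX_WRONG);
    }
}
```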

Write (in your language of choice) a GUI based game of Hangman

Now that you have a basic idea of programming and can write the underlying game logic it is probably time for a more user-friendly interface. Even if you plan to make a game engine yourself or hope to forever write console-based programs, learning how to work with a GUI (or any framework for that matter) is a vital skill for programmers. Aspiring programmers have to learn how to use other people’s code or frameworks and get their code to work nicely with it. This is something that comes up again and again. Software programming is not about reinventing the wheel every time you need to do something. For this reason I would suggest learning about what sort of GUI kits are available for your language of choice. For Java, you might want to check out Swing. For Python, you might want to check out Tkinter. Do some research into which GUI frameworks are available for your language. However, don’t stress too much about which framework to use. You’ll be building a fairly simple game (and you already wrote the underlying game logic).

The key here is that you’ll gain some experience with some of the more advanced topics of your language. You might not have had to worry about inheritance or objects and classes up until this point. However sooner or later you’ll need to learn how to read the documentation for another framework and implement it correctly. You may realize that your game logic isn’t as nice or modular as it could be. Or you may realize that you didn’t do enough testing on one component of your software and now when you put everything together you can’t figure out where the error is. You may even learn about design patterns (gasp!), such as the observer pattern that most GUI’s are based upon. Regardless, this is often a very challenging step but a necessary one if you want to succeed as a programmer. It’s hard to name even the simplest piece of software that was built without other frameworks. Minecraft was built upon the LWJGL (which is actually built upon OpenGL and OpenAL), 3d Sound System, and JOrbis frameworks. Namebench was built upon Python, Tkinter, PyObjC, dnspython, jinja2 and graphy. Bioshock was built upon UnrealEngine 2.5 and Havok Physics. You have to know how to make your code work with other code.

I’ll also throw in one more suggestion for making your Hangman game more exciting and fleshing out your skills– try interfacing with a random word generator API to get a word before each game. For more information on using an API, see my tutorial here.

Write (in your language of choice) a simple 2D platformer

Now that you have the experience of writing simple game logic and the experience of working with a GUI framework you can move onto something more personal. A simple 2 dimensional platformer should have enough constraint and simplicity that you don’t get overwhelmed and discouraged, but also enough flexibility to allow yourself to make this your project.

What if your end goal is to make a website? Or the next mobile app, or a search engine or a web crawler? Why a platformer game? Because programming games makes you think about writing good code. It forces you to think about the best data structure or the best algorithm. Game programming is not entirely different than programming for something else. If your game is running slowly you know your code isn’t that great. A simple FPS (frames per second) counter allows you to continuously benchmark your progress. And best of all, programming a game from scratch is a lot of fun.

But remember, keep it simple! Start with really small goals:

  • Create a stationary red square and display it on the screen.

Once that is taken care of, try adding animation. Here you’ll have to determine how your programming language or GUI framework deals with threads and animation. (Hint: If you’re using Java you’ll probably implement a Runnable.)

  • Have the red square move and bounce around on the screen.
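Before wiring anything into a GUI, it can help to get the bounce logic working on its own. Here is a minimal Java sketch of just the per-frame update, assuming the square moves a fixed number of pixels each frame inside a rectangular area. The class and field names are made up for illustration, and the drawing and animation thread are left to whatever framework you picked.

```java
public class BouncingSquare {

    public final int size;   // side length in pixels
    public int x, y;         // top-left corner in pixels
    public int dx, dy;       // velocity in pixels per frame

    public BouncingSquare(int x, int y, int size, int dx, int dy) {
        this.x = x;
        this.y = y;
        this.size = size;
        this.dx = dx;
        this.dy = dy;
    }

    /** Advances one frame and reverses direction at the edges of a width x height area. */
    public void step(int width, int height) {
        x += dx;
        y += dy;
        if (x < 0) {
            x = 0;
            dx = -dx;
        } else if (x + size > width) {
            x = width - size;
            dx = -dx;
        }
        if (y < 0) {
            y = 0;
            dy = -dy;
        } else if (y + size > height) {
            y = height - size;
            dy = -dy;
        }
    }

    public static void main(String[] args) {
        // Simulate a few frames in a 100 x 100 area and print the positions.
        BouncingSquare square = new BouncingSquare(0, 0, 10, 4, 3);
        for (int frame = 0; frame < 5; frame++) {
            square.step(100, 100);
            System.out.println("frame " + frame + ": x=" + square.x + " y=" + square.y);
        }
    }
}
```

In a real game, the animation loop (for example the run method of a Runnable) would call step once per frame and then ask the screen to repaint.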

Now once you have figured out basic animation you can add code so the red square responds to keyboard input. Here you’ll have to determine how your programming language or GUI framework deals with keyboard input. (Hint: If you’re using Java you’ll probably implement a KeyListener).

  • Instead of random bouncing movement, have the red square respond to keyboard buttons. (It moves up when the Up button is pressed, left when the Left button is pressed, etc).
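Here is one possible shape for the keyboard step, sketched in Java. It only covers turning a key code into movement. The class name and the speed of 5 pixels per key press are assumptions for illustration. In an AWT or Swing program, the keyPressed method of your KeyListener would typically pass e.getKeyCode() to something like handleKey.

```java
import java.awt.event.KeyEvent;

public class KeyboardMover {

    // Assumed speed: how many pixels the square moves per key press.
    public static final int SPEED = 5;

    public int x, y;   // top-left corner of the square in pixels

    public KeyboardMover(int x, int y) {
        this.x = x;
        this.y = y;
    }

    /** Moves the square according to an arrow key code; other keys are ignored. */
    public void handleKey(int keyCode) {
        switch (keyCode) {
            case KeyEvent.VK_UP:
                y -= SPEED;   // screen coordinates grow downward, so "up" subtracts
                break;
            case KeyEvent.VK_DOWN:
                y += SPEED;
                break;
            case KeyEvent.VK_LEFT:
                x -= SPEED;
                break;
            case KeyEvent.VK_RIGHT:
                x += SPEED;
                break;
            default:
                break;
        }
    }

    public static void main(String[] args) {
        KeyboardMover mover = new KeyboardMover(50, 50);
        mover.handleKey(KeyEvent.VK_UP);
        mover.handleKey(KeyEvent.VK_LEFT);
        System.out.println("x=" + mover.x + " y=" + mover.y);
    }
}
```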

Now step back and look at your code. Give yourself a pat on the back. Those three simple goals might not have turned out to be so simple after all. The good news is that now you can start customizing your game. Let’s learn how to deal with File I/O and manipulating PNG images.

  • Draw a small 20 by 10 pixel image and save it as a .png file.
  • Have your game read in this image and display it instead of the red square.
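If you happen to be working in Java, reading a .png can be sketched with the standard ImageIO class. The class name SpriteLoader is hypothetical. So that the example runs without any extra files, main first writes a small 20 by 10 red image to a temporary file and then reads it back. In your game you would point load at the file you drew instead.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class SpriteLoader {

    /** Reads an image file from disk, failing loudly if it cannot be decoded. */
    public static BufferedImage load(File file) throws IOException {
        BufferedImage image = ImageIO.read(file);
        if (image == null) {
            // ImageIO.read returns null when no installed reader understands the file.
            throw new IOException("Not a readable image: " + file);
        }
        return image;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the image you drew: a 20 x 10 opaque red rectangle.
        BufferedImage drawn = new BufferedImage(20, 10, BufferedImage.TYPE_INT_ARGB);
        for (int y = 0; y < drawn.getHeight(); y++) {
            for (int x = 0; x < drawn.getWidth(); x++) {
                drawn.setRGB(x, y, 0xFFFF0000);
            }
        }
        File file = File.createTempFile("player", ".png");
        file.deleteOnExit();
        ImageIO.write(drawn, "png", file);

        BufferedImage sprite = load(file);
        System.out.println("Loaded " + sprite.getWidth() + " x " + sprite.getHeight() + " image");
        // getRGB gives per-pixel access, which is handy for the level ideas further down.
        System.out.println("Top-left pixel (ARGB): " + Integer.toHexString(sprite.getRGB(0, 0)));
    }
}
```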

If you are developing on a Windows machine, I would recommend using Paint.net to draw the image instead of the Microsoft Paint that comes with every copy of Windows. I also recommend saving it as a .png file because most programming languages have libraries that allow you to manipulate .png files more easily (and especially on a per-pixel level which may be very helpful down the road). Once you figured out how to read in and display a custom image, let’s make this platformer feel like a platformer:

  • Determine how you’re going to store and display the level.
  • Make the level.
  • Display the level.

What does this mean exactly? Well, a platformer game is composed of platforms. Your player (the picture you just drew) will move around and interact with this platform. Maybe there are death spikes somewhere on the platform, or maybe just grass, or maybe there is snow. Regardless, you need to have a level that contains “blocks” or elements that the player interacts with. There are a couple ways of representing this level– You could store the entire map and keep track of every single pixel, or you can break up your level into discrete elements (i.e. one block is 5 x 5 pixels). I prefer the latter approach because it allows you to create levels quickly and reuse components. Here are two ways of storing the level data:

The text file approach:

The .png file approach (100% magnification):

The same .png file (at 400% magnification):

Notice that these two approaches contain the same sort of data. Either way you will have to write code that reads in a file (probably the .png or the .txt file), loops through the data, and builds the level. Here’s what the level looks like after the game translates the 54 x 40 element level into a 648 x 480 pixel game (each “element” is a 12 x 12 pixel block).
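The read, loop, and build idea for the text file approach might look something like this in Java. The symbols are invented for illustration ('#' for a solid block, '.' for empty space, 'P' for the player's spawn point), so substitute whatever your own level format uses. The 12 pixel block size matches the level described above. To keep the sketch self-contained the level text is passed in as a string rather than read from a file.

```java
public class LevelParser {

    // Each level element is drawn as a 12 x 12 pixel block.
    public static final int BLOCK_SIZE = 12;

    /** Splits the level text into rows and columns of single-character elements. */
    public static char[][] parse(String text) {
        String[] rows = text.split("\n");
        char[][] grid = new char[rows.length][];
        for (int r = 0; r < rows.length; r++) {
            grid[r] = rows[r].toCharArray();
        }
        return grid;
    }

    /** Converts a row or column index into a pixel coordinate on screen. */
    public static int toPixel(int index) {
        return index * BLOCK_SIZE;
    }

    public static void main(String[] args) {
        // Hypothetical symbols: '#' solid block, '.' empty space, 'P' spawn point.
        String levelText =
            "........\n" +
            "..P.....\n" +
            "####..##\n";

        char[][] grid = parse(levelText);
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                if (grid[row][col] == '#') {
                    System.out.println("Solid block at pixel (" + toPixel(col) + ", " + toPixel(row) + ")");
                }
            }
        }
    }
}
```

The .png approach would follow the same loop, except each element comes from a pixel's color instead of a character.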

As you can see this is quite a simple game which was the goal of the project. Once you have figured out how to create, store, and render out a level you can start having fun. Here are some next steps (in no particular order) to practice and develop your programming knowledge:

  • Create different blocks (a sticky block, death spikes, spawn points)
  • Try adding gravity
  • Figure out how to implement collision detection
  • Add another level. Determine how you will “load” levels or switch to the next one. Does the “camera” follow along, or does the level change when you move to the edge of the screen?
  • Put a measurable metric such as a Frames Per Second (FPS) counter somewhere in your game. How will you determine your frames per second?
  • Add other entities to the game such as enemies
  • Make your static graphics move. Animate your character walking around.
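For the collision detection item, one common starting point (certainly not the only one) is to treat the player and every block as an axis-aligned rectangle and test whether two rectangles overlap. Here is a small Java sketch of that test. The class and method names are my own, and rectangles that merely touch along an edge are treated as not colliding.

```java
public class Collision {

    /**
     * Returns true if rectangle A (ax, ay, aw, ah) and rectangle B (bx, by, bw, bh) overlap.
     * Each rectangle is given by its top-left corner plus its width and height in pixels.
     */
    public static boolean overlaps(int ax, int ay, int aw, int ah,
                                   int bx, int by, int bw, int bh) {
        return ax < bx + bw && ax + aw > bx
            && ay < by + bh && ay + ah > by;
    }

    public static void main(String[] args) {
        // A 20 x 10 player standing partly inside a 12 x 12 block.
        System.out.println(overlaps(30, 40, 20, 10, 36, 48, 12, 12));
        // The same player far away from the block.
        System.out.println(overlaps(30, 40, 20, 10, 200, 48, 12, 12));
    }
}
```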

Again these are just some ideas. I’ll leave it up to you to figure out what you want your game to look like and to define your own goals. You’ll quickly realize that this simple game may not be so simple after all. You’ll also start to realize the advantages of object-oriented design, or maybe why Flash ActionScript is a popular programming language for developing games in (animations, collision detection, graphic manipulations are all very trivial to perform in ActionScript). However all of these things will make you a more knowledgeable and skilled programmer in the end if you actually code them yourself.

As you struggle to make this “simple” game please realize that you are not alone. You will get stuck (and often). When I was working on this game and got stuck I gave myself a little break. Take a walk, get some coffee, surf the internet. Another good option is to see what the community is up to. Maybe someone else is making a very similar game and you can look to them for inspiration (Hint: Here is Minecraft Developer Notch’s simple platformer in Java, as well as his source code (.zip) (which he released openly for Ludum Dare)). Take a look at what others are doing and how they handled the problems that you are facing.

Join a community (ask questions and answer questions)

Whether you are taking a break from your programming project (and you should take breaks!) or you just woke up and aren’t quite ready for brain puzzles, a good middle ground is a programming community. Some suggestions are Stack Overflow, Reddit’s /r/learnprogramming (for beginners), /r/programming (for advanced topics or news), or one of the many other communities that exist to bring people together and share knowledge. The important thing to remember as you struggle along on your journey is that you are not alone. All of the questions that you have or problems that you will run into have likely been brought up and discussed somewhere online. Take advantage of that!

Another perk of belonging to a programming community is that you’ll often discover new ways or different approaches to tackling a problem. It also keeps possible solutions or technologies on your radar. Somewhere down the road you may have to do some specific task and thanks to frequently skimming the Q&As of a community forum you may have already read people’s past experiences, problems, and solutions.

It can also be said that by teaching or explaining something you will be a better master of your craft. This is exactly why I started this website. The purpose of this website isn’t just to teach you. My goal is to become a better writer, communicator, and programmer. If you can explain a concept to a complete stranger in a concise fashion and have them walk away knowledgeable then you will succeed in life. Your colleagues and friends will like you more. You will do better in job interviews. Your boss will appreciate you more. Communication is incredibly important for just about any job, and programmers are generally terrible at it. Take a look at some of the most successful scientists, engineers, and programmers of our time. Are they good communicators? You bet.

Join a community (such as a game competition)

Another community to look into is one of the many game competitions that exist out there. Even if you have never made fully working software, a game, or an app before you probably know enough to make something. Even if your game is as simple as a two button, poorly drawn animation loop, it’s still an awesome accomplishment (no offense to Shadow’s game, I think it’s great). Just remember, keep it simple. Some of the most successful games and software are often the simplest.

There are so many reasons why you should enter and participate in a game competition. You’ll learn a lot. You will go through all of the stages of game/software development– brainstorming, prototyping, creating, bug hunting, and publishing. You’ll get a lot of publicity. People will actually play your game and give you feedback. And best of all, at the end of it, you can look back and point to a completed project. Just how good does that feel? Answer: really good. Instead of talking about all of your good ideas, this will actually force one out of you.

If you’re unsure of which ones to join, here are my suggestions: Ludum Dare has an amazing community of fun, respectful, and sometimes famous developers (this competition can be partly attributed to Minecraft developer Notch’s online following). Ludum Dare also has really fun “I’m in” videos and hosts to kick off each competition. Another competition worth checking out is the Java4K competition where participants try to develop a game in under 4 KB of Java code.

Find and solve short programming problems

Another resource worth tapping is one of the many websites that provide simple programming problems. A website that immediately comes to mind is Project Euler. On it there are some 350+ problems that make for perfect little programming puzzles to work through. They are organized from the most easily solved to the ones that few people have solved. What I really found helpful was that you know when you solved the problem. There’s a box to input your guess and the website will tell you whether you got it right or wrong. Once you’ve proven your merit you will be presented with a forum to look at other people’s solutions to see how it was solved in another language or perhaps with a novel algorithm or approach. Many of the initial problems can be solved by beginner programmers with sloppy inefficient solutions, but as you progress you may realize that your brute-force approach will get you the right answer eventually (but could take hours), forcing you to rethink your approach and create the optimal solution.
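To give a feel for the difference between brute force and a smarter approach, here is a Java sketch based on the first Project Euler problem, which asks for the sum of all the multiples of 3 or 5 below 1000. The problem statement itself gives the small case: below 10 the multiples are 3, 5, 6 and 9, which sum to 23. The class and method names are my own. On a problem this small both versions finish instantly, so the point is only the change in approach, which starts to matter a great deal on the later problems.

```java
public class MultiplesSum {

    /** Brute force: check every number below the limit. */
    public static long bruteForce(long limit) {
        long sum = 0;
        for (long n = 1; n < limit; n++) {
            if (n % 3 == 0 || n % 5 == 0) {
                sum += n;
            }
        }
        return sum;
    }

    /** Sum of all multiples of k below the limit, using k * (1 + 2 + ... + n). */
    private static long sumOfMultiples(long k, long limit) {
        long n = (limit - 1) / k;        // how many multiples of k are below the limit
        return k * n * (n + 1) / 2;
    }

    /** No loop at all: add the multiples of 3 and 5, then remove the double-counted multiples of 15. */
    public static long closedForm(long limit) {
        return sumOfMultiples(3, limit) + sumOfMultiples(5, limit) - sumOfMultiples(15, limit);
    }

    public static void main(String[] args) {
        System.out.println(bruteForce(10));    // 3 + 5 + 6 + 9 = 23
        System.out.println(closedForm(10));    // same answer without looping
        System.out.println(bruteForce(1000) == closedForm(1000));
    }
}
```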

Other places to look for programming problems and exercises are academic courses. Many computer science professors put up all of the lectures, assignments, exams, and sometimes solutions on their course website allowing people from around the world to gain critical knowledge without paying absurd tuition rates. Some institutions deliberately open up their course information for this reason– MIT has curriculum available, and Stanford has a program as well.

Take advantage of these resources. As mentioned at the top of this article, only by programming will you gain programming knowledge. Tackle complex problems, and at all times challenge yourself. Athletes don’t stay in shape by taking a walk. Driving to work doesn’t make you a professional driver. Replying to emails doesn’t improve your typing skills. Good programmers don’t get any better by writing code that they know how to write. Challenge yourself.

Books I found helpful

And finally take a minute to remember books. Those of us who are constantly reading blogs, or who went to school long ago, may often forget that there is a plethora of textbooks available whose sole purpose is to teach us and provide knowledge. In each field there are always the classics (such as Strunk and White’s standard for English majors) and computer science is no exception. Many people will point to the widely regarded K&R book on C programming as required reading. For those in software development, a required reading often cited is Design Patterns written by the “Gang of Four”. And for whatever reason, computer scientists and programmers often reference Lewis Carroll’s Alice in Wonderland as well as Through the Looking Glass, so it might be a good fun book to read before bed.

With those “required readings” out of the way I will briefly mention two books that really jump started my journey into programming. The first is Larry Ullman’s PHP 6 and MySQL 5 for Dynamic Websites which allowed me to create fun, useful, and interactive websites quickly. I had attempted to learn various other languages before picking up that book, but that was the first time that I really felt like I was learning and creating. I believe that Larry Ullman has recently (summer 2011) written a 4th edition of his book which you may want to check out as well.

Once I had picked up the basics on programming I found an amazing Java book that I used to dive into the deeper aspects of programming. I believe that I am not alone in recommending Head First Java by Sierra and Bates (take a look at the reviews as well). Though it may not be ideal if you are just starting out (read my lessons learned as a beginner programmer here), it is definitely worth checking out after you know the basics (which is what this article is aimed at).

Disclaimer: I receive a small percentage of the sale when you buy those books (or anything from Amazon for that matter) by following one of those links listed above. It doesn’t increase the price for you, but it helps pay for this web hosting. Thanks!

Conclusion

I hope you found this helpful! If you have any tips or resources that you’ve used along the way please share them in the comments. I hope you’ve noticed that I’ve deliberately left out advice on picking a programming language and just about all of my examples and resources are accessible in whichever language you want. For my reasoning behind that, you may want to read my article reflecting on my journey into programming.

Many beginner programmers see the acronym API all over the place. Why are API’s everywhere? What do you do with them? How do they work? At the same time, many beginner programmers see or encounter XML. Why is XML everywhere? How do you turn XML into the integers or strings that I know how to deal with? These are excellent questions that aspiring programmers may ask themselves. For me it was difficult to grasp the big picture and see exactly why these two acronyms were talked about so often in the programming world. In this article I’ll explain what API’s are, why XML is so often associated with them, and at the end give a short example of how to “connect” to an API, grab some XML from it, and parse it to turn it into the integers or strings that you probably know how to manipulate on a regular basis.

So what is an API (besides an Application Programming Interface)?

Imagine you worked for a large company named Word Co. that organized words, specifically English language words. Perhaps your company scanned a bunch of textbooks and collected all of the words, counted the words, and created a big database full of useful information related to words. Basically you have a big set of information and one day your company (Word Co.) decides it wants to make all of that data available for other companies or allow individuals to see or access it. What are your options?

  • Give people the actual database
  • Make a website that pulls from the database
  • Make an API that allows programmers to interact with the database

The first option is probably not a good one because the database can be huge (potentially gigabytes or terabytes of information), you may be using a proprietary database (such as Google’s BigTable) or software, or maybe you just spent millions of dollars collecting this information and you want to charge people for accessing it.

The second option may be a really neat idea but might not work if you wanted a mobile device or app to access it, or if you wanted to present the information in a different way other than a chart or web form. Imagine if someone wanted to make a Hangman game where you try to guess a random word (maybe a random word that was pulled from the big database of English language words) before a stick figure is “hung”. This is something the website cannot directly perform.

An API allows people to grab information (or use services) that are part of a huge data set in ways that might not be imagined by the people who created that large data set. If Word Co. organized English words and created an API to access those words, let’s take a minute to imagine what others can create with it:

Which are all applications or tools that Word Co. doesn’t have the time or desire to create. API’s are usually intended to allow third parties to create awesome things using existing data that a company has already harvested and collected. What are some other services that might have API’s?

  • Weather services usually have API’s
  • Google has a ton of API’s (like their Maps API, their search engine, and just about everything else)
  • Facebook allows third-party developers to interact with the Facebook data
  • Twitter
  • Nearly everything else

So how do I “connect to” or use an API?

Although many API’s are different, it often boils down to making a request and getting some data. Some API’s give you a bunch of code or libraries that you add to your project, and then use that code to make the requests, but many other API’s are quite simple. If you are new to programming, I’d suggest looking for REST or so-called “RESTful” API’s. Other ways to access API’s such as SOAP also exist, but in my opinion are a little harder to get started with. Fortunately many API’s that used to be SOAP based are now REST based. Let’s outline how you would use a typical REST based API:

  1. Make an HTTP request to a web server. Usually you’ll include a variable or two that is passed in through the URL
  2. Get some data back (typically XML)
  3. Parse the XML (the XML is just a big character stream and you’ll want to grab certain pieces of it and turn it into other data types or create an object)
  4. Use that data to do neat things! (Like create a Hangman game with a random word you just grabbed)
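Step 1 mostly comes down to building a URL with your variable tacked onto the end. Here is a small Java sketch of that, using a made-up endpoint for the imaginary Word Co. (api.example.com and the startsWith parameter are purely hypothetical). The one real detail worth noticing is URLEncoder, which escapes spaces, commas, and other characters that aren’t allowed to appear raw in a query string.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class RequestUrlBuilder {

    // Hypothetical endpoint for the imaginary Word Co. API.
    public static final String BASE = "http://api.example.com/words";

    /** Builds a request URL with one query parameter, escaping the name and the value. */
    public static String build(String name, String value) throws UnsupportedEncodingException {
        return BASE + "?" + URLEncoder.encode(name, "UTF-8")
                    + "=" + URLEncoder.encode(value, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Spaces become '+' and the comma becomes %2C in the encoded value.
        System.out.println(build("startsWith", "to be"));
        System.out.println(build("location", "Seattle, WA"));
    }
}
```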

Notice that the data that comes back from an API is typically XML. Why XML? Because it’s a great intermediary “language”. Imagine if you wrote your Hangman game in Java and the Random Word API gave you Python code back. That wouldn’t be very useful. Or if you wrote something in C/C++ and an API gave you a serialized Java object.

What makes XML so popular (especially with API’s) is that it allows you to use whichever language you want, and gives you data that is both human readable and computer readable. Just about any programming language comes with standard libraries to parse XML quickly and easily. If you’re an advanced programmer, it also allows you to build objects or data structures (like if you’re dealing with A TON of data) exactly how you want them instead of forcing you to accept whatever the API gives you.

A concrete example in Java

Let’s make something! Imagine you wanted to create your own Android weather app. Since we aren’t meteorologists, we’ll get all of the weather information from someone else– Google’s Weather API. Other options are the National Weather Service (in the U.S.) or maybe Weather Underground. Most of the API’s out there are well documented and tell you how you should connect, use, or interface with them. Google’s Weather API is a little weird in that there is no documentation. I think it’s sort of a secret API. But here’s how you use it:

  1. Make an HTTP request to http://www.google.com/ig/api?weather=Location where Location is whatever you want (A postal code or city).

That’s it! You’ll get a bunch of XML back with the current weather and forecast information. You can even try it out in your web browser (since your web browser makes HTTP requests on a very regular basis). Let’s see what happens when we use Seattle WA as an example (from http://www.google.com/ig/api?weather=Seattle+WA):

<xml_api_reply version="1">
<weather module_id="0" tab_id="0" mobile_row="0" mobile_zipped="1" row="0" section="0">
<forecast_information>
<city data="Seattle, WA"/>
<postal_code data="Seattle WA"/>
<latitude_e6 data=""/>
<longitude_e6 data=""/>
<forecast_date data="2011-09-29"/>
<current_date_time data="2011-09-29 17:53:00 +0000"/>
<unit_system data="US"/>
</forecast_information>
<current_conditions>
<condition data="Clear"/>
<temp_f data="62"/>
<temp_c data="17"/>
<humidity data="Humidity: 62%"/>
<icon data="/ig/images/weather/sunny.gif"/>
<wind_condition data="Wind: N at 4 mph"/>
</current_conditions>
<forecast_conditions>
<day_of_week data="Thu"/>
<low data="56"/>
<high data="72"/>
<icon data="/ig/images/weather/sunny.gif"/>
<condition data="Clear"/>
</forecast_conditions>
<forecast_conditions>
<day_of_week data="Fri"/>
<low data="56"/>
<high data="70"/>
<icon data="/ig/images/weather/mostly_sunny.gif"/>
<condition data="Partly Sunny"/>
</forecast_conditions>
<forecast_conditions>
<day_of_week data="Sat"/>
<low data="49"/>
<high data="65"/>
<icon data="/ig/images/weather/rain.gif"/>
<condition data="Showers"/>
</forecast_conditions>
<forecast_conditions>
<day_of_week data="Sun"/>
<low data="54"/>
<high data="65"/>
<icon data="/ig/images/weather/chance_of_rain.gif"/>
<condition data="Chance of Rain"/>
</forecast_conditions>
</weather>
</xml_api_reply>

And let’s imagine we want to extract the highs and lows in this XML so we can use them in our Android weather app. As mentioned, many programming languages have built in libraries that allow you to parse the XML. Since XML is so popular, there are even multiple approaches to parsing it, even within a given language. Java has both a DOM parser and a SAX parser built in. Python also has a DOM parser and a SAX parser built in. What are DOM and SAX parsers?

  • SAX (Simple API for XML) parsers are stream oriented parsers and typically use less memory and are faster
  • DOM (Document Object Model) parsers are tree traversal parsers and can consume more memory if you’re dealing with large amounts of XML

When should you use one over the other? When you are dealing with HUGE amounts of data. Most of the time (such as right now) you don’t need to worry and can use whichever one you’re comfortable with. I’ll be using the Java SAX parser in this example.
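To make the difference a little more concrete, here is a minimal DOM-style sketch in Python. It is only an illustration: SAMPLE is a shortened, hand-copied piece of the response above, and forecast_from_dom is a name I made up for this example. The DOM parser reads the whole document into a tree first, and only then do we walk the tree to pull out the days, lows, and highs.

```python
from xml.dom import minidom

# A shortened, hand-copied piece of the response shown above
# (assumption: the real reply has more elements but the same shape).
SAMPLE = """<xml_api_reply version="1">
<weather>
<forecast_conditions>
<day_of_week data="Thu"/>
<low data="56"/>
<high data="72"/>
</forecast_conditions>
<forecast_conditions>
<day_of_week data="Fri"/>
<low data="56"/>
<high data="70"/>
</forecast_conditions>
</weather>
</xml_api_reply>"""


def forecast_from_dom(xml_text):
    # DOM: the entire document is parsed into an in-memory tree first...
    document = minidom.parseString(xml_text)
    forecast = []
    # ...and then we walk that tree to pick out the parts we care about.
    for node in document.getElementsByTagName("forecast_conditions"):
        day = node.getElementsByTagName("day_of_week")[0].getAttribute("data")
        low = int(node.getElementsByTagName("low")[0].getAttribute("data"))
        high = int(node.getElementsByTagName("high")[0].getAttribute("data"))
        forecast.append((day, low, high))
    return forecast


print(forecast_from_dom(SAMPLE))
```

Because the whole tree sits in memory, you can jump around in it freely, which is convenient for small documents like this one. A SAX parser never builds that tree; it just tells you about each tag as it streams past.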

Remember the steps to do this? 1) Make an HTTP request to the API, typically passing in a URL variable, 2) Get the data back and then parse it, and finally 3) Do neat things! Let’s see what that looks like in Java code:

Weather.java (first draft)

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

public class Weather
{

    public static final String URL_SOURCE = "http://www.google.com/ig/api?weather=";


    public static void main(String[] args)
    {
        /*** Create the request ***/
        // Let's pick a location:
        String location = "Seattle, WA";
        // Create the URL:
        String query = URL_SOURCE + location;
        // Replace blanks with their URL-encoded equivalent (%20):
        query = query.replace(" ", "%20");

        /***
         * Make the request (This needs to be in a try-catch block because things can go wrong)
         ***/
        try
        {
            // Turn the string into a URL object
            URL urlObject = new URL(query);
            // Open the stream (which returns an InputStream):
            InputStream in = urlObject.openStream();

            /** Now parse the data (the stream) that we received back ***/
            // Coming shortly since we need to set up a parser

        }
        catch(IOException ioe)
        {
            ioe.printStackTrace();
        }
    }
}

So at this point we have some simple Java code that connects to the Google Weather API and receives some data back. In the above case, we are getting our data (the XML) in the form of an InputStream. In other languages you’ll still probably be receiving the data as a stream. Streams and I/O are a pretty big part of programming, so if you’re not sure how to work with these, now is a good time to start. Anyways, we now need to set up the XML parser. As mentioned I am picking the SAX parser for this example, and as the SAX parser explains on its website, you need to create a handler for handling the XML. In other words, you need to tell it what to do when it encounters specific parts of the XML. In this case we’ll look for <low>, <high>, and <day_of_week> tags. To define this behavior we’ll extend SAX’s DefaultHandler (meaning we give it more functionality than the default functionality). Let’s see what this looks like:

GoogleHandler.java

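import java.util.ArrayList;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class GoogleHandler extends DefaultHandler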
{

    // Create three array lists to store the data
    public ArrayList<Integer> lows = new ArrayList<Integer>();
    public ArrayList<Integer> highs = new ArrayList<Integer>();
    public ArrayList<String> days = new ArrayList<String>();


    // Make sure that the code in DefaultHandler's
    // constructor is called:
    public GoogleHandler()
    {
        super();
    }


    /*** Below are the three methods that we are overriding ***/

    @Override
    public void startDocument()
    {
        System.out.println("Start document");
    }


    @Override
    public void endDocument()
    {
        System.out.println("End document");
    }


    // This is where all the work is happening:
    @Override
    public void startElement(String uri, String name, String qName, Attributes atts)
    {
        if(qName.compareTo("day_of_week") == 0)
        {
            String day = atts.getValue(0);
            System.out.println("Day: " + day);
            this.days.add(day);
        }
        if(qName.compareToIgnoreCase("low") == 0)
        {
            int low = Integer.parseInt(atts.getValue(0));
            System.out.println("Low: " + low);
            this.lows.add(low);
        }
        if(qName.compareToIgnoreCase("high") == 0)
        {
            int high = Integer.parseInt(atts.getValue(0));
            System.out.println("High: " + high);
            this.highs.add(high);
        }
    }
}

And now that we have defined how the XML parser should behave, let’s add in our GoogleHandler to the Weather code:

Weather.java (final draft)

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLReaderFactory;

public class Weather
{

    public static final String URL_SOURCE = "http://www.google.com/ig/api?weather=";


    public static void main(String[] args)
    {
        /*** Create the request ***/
        // Let's pick a location:
        String location = "Seattle, WA";
        // Create the URL:
        String query = URL_SOURCE + location;
        // Replace blanks with their URL-encoded equivalent (%20):
        query = query.replace(" ", "%20");

        /***
         * Make the request (This needs to be in a try-catch block because things can go wrong)
         ***/
        try
        {
            // Turn the string into a URL object
            URL urlObject = new URL(query);
            // Open the stream (which returns an InputStream):
            InputStream in = urlObject.openStream();

            /** Now parse the data (the stream) that we received back ***/

            // Create an XML reader
            XMLReader xr = XMLReaderFactory.createXMLReader();

            // Tell that XML reader to use our special Google Handler
            GoogleHandler ourSpecialHandler = new GoogleHandler();
            xr.setContentHandler(ourSpecialHandler);

            // We have an InputStream, but let's just wrap it in
            // an InputSource (the SAX parser likes it that way)
            InputSource inSource = new InputSource(in);

            // And parse it!
            xr.parse(inSource);

        }
        catch(IOException ioe)
        {
            ioe.printStackTrace();
        }
        catch(SAXException se)
        {
            se.printStackTrace();
        }
    }
}

Doesn’t look so bad, does it? If you go ahead and compile the two files (both Weather.java and GoogleHandler.java) you should be able to run it without any problems. Here’s the output when I ran it:

Start document
Day: Thu
Low: 56
High: 72
Day: Fri
Low: 56
High: 70
Day: Sat
Low: 49
High: 65
Day: Sun
Low: 54
High: 65
End document

In the GoogleHandler there are System.out.println() commands, but it also adds the integers and strings into their own array lists which you can now access in a more familiar way (such as calling days.get(0) to get the first day of the week in that array list).

A concrete example in Python 3

And finally let’s take a quick look at how to do this in Python, again using a SAX parser. As you can see, Python does quite a bit of heavy lifting for you (such as making the HTTP request and getting the XML, which is one line of code). Go ahead and copy/modify this code for any of your projects. It was built and tested with Python 3.2.2 in October 2011.

Weather.py

import urllib.request
import xml.sax

# Create some lists to store the data:
lows = []
highs = []
days = []

# Define our special Google Handler that extends
# what the default content handler does
class GoogleHandler(xml.sax.ContentHandler):
	def startElement(self, name, attrs):
		if name=="day_of_week":
			print("Day:", attrs['data'])
			days.append(attrs['data'])
		if name=="low":
			print("Low:", attrs['data'])
			lows.append(attrs['data'])
		if name=="high":
			print("High:", attrs['data'])
			highs.append(attrs['data'])

# Make an HTTP request at the specified URL
# and get back a bunch of XML
xmlResponse = urllib.request.urlopen('http://www.google.com/ig/api?weather=Seattle+WA')

# Create a SAX Parser
parser = xml.sax.make_parser()
# Tell the parser to use our special handler
parser.setContentHandler(GoogleHandler())
# And parse the XML!
parser.parse(xmlResponse)

# Print out the lists:
print("Days:", days)
print("Lows:", lows)
print("Highs:", highs)

And let’s see what sort of output we get when we run it:

Day: Thu
Low: 56
High: 72
Day: Fri
Low: 56
High: 70
Day: Sat
Low: 49
High: 65
Day: Sun
Low: 54
High: 65
Days: ['Thu', 'Fri', 'Sat', 'Sun']
Lows: ['56', '56', '49', '54']
Highs: ['72', '70', '65', '65']
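Notice that the lows and highs come back as strings (the Java version stored them as integers). If you want to do any math on them, here is one possible follow-up sketch. It starts from the same three lists printed above; the names forecast, warmest, and average_low are just ones I picked for this example.

```python
# The three lists exactly as Weather.py printed them above
days = ['Thu', 'Fri', 'Sat', 'Sun']
lows = ['56', '56', '49', '54']
highs = ['72', '70', '65', '65']

# zip() walks the three lists in step; int() turns '56' into 56
forecast = [(day, int(low), int(high))
            for day, low, high in zip(days, lows, highs)]

# Now that the temperatures are numbers we can compare and average them
warmest = max(forecast, key=lambda entry: entry[2])
average_low = sum(low for _, low, _ in forecast) / len(forecast)

print("Forecast:", forecast)
print("Warmest day:", warmest[0], "with a high of", warmest[2])
print("Average low:", average_low)
```

You could also call int() inside the handler itself, right where the values are appended, which is closer to what the Java handler does.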

I hope this tutorial was helpful. If you have questions please ask away. I’ll also add that our fictional Word Co. API (as mentioned at the top of this article) isn’t just a made-up concept to explain APIs. It actually exists!

Interested in learning to program and write code? Wondering what programming language you should teach yourself? Curious how other people got started? In this article I’ll explain how I started from ground zero, knowing nothing about programming or software development, struggled to grasp new languages and concepts, and later knew enough to get a job making custom website back-ends, writing scripts for my colleagues, and developing mobile apps. Along the way I’ll point out some helpful resources that I have found, as well as common pitfalls that I hope you’ll avoid. Let’s begin!

Which language should I learn?

A lot of people new to programming often ask this question. Depending on who you’ll ask (or where you ask it) you’ll often get a lot of different answers. When I was just starting out I asked people this question and did a lot of research. Java was something that people kept suggesting. Looking at all the features it had (like object-oriented, reusable, automatic memory management, or portable) made it sound very appealing, even if I didn’t exactly know what those words meant for a language. What ultimately convinced me was the adoption rate and the number of companies and organizations who used Java. I was thinking that if everyone else was using it, then it must be a good choice, right?

Wrong. Java was very hard to dive into. Your first, supposedly simple “Hello World” application was full of weird keywords that would take an experienced programmer a few days to fully explain. What the hell does public static void main(String[] args) even mean? Why is it necessary? What’s arguably worse about Java for beginners is that it introduces the advanced concept of objects and classes way too early in the game. Add strong typing, inheritance, and polymorphism to the mix and you’ll have beginners scratching their heads and getting discouraged. Sure those features are awesome (and almost necessary for large projects), but they are advanced topics that someone new to programming shouldn’t be too concerned about. But Java almost forces you to use them, or at least think about them. That’s why to get anything to run you have to use static void main(String[] args).

So after getting all sorts of excited to finally learn programming, I purchased Head First Java and spent a couple weeks writing my first programs. For the reasons summarized above, I was quickly discouraged and each day teaching myself programming was getting more tiresome. I took a very long break of about 6 months because I thought programming was difficult and it didn’t feel very rewarding.

What was the first language that was exciting?

I later thought it’d be a fun project to get a website up and running. After figuring out how to make a simple website (full of static HTML pages) I wanted to make a dynamic, user-interactive website which I realized would be easiest with PHP. After researching a good book, I ordered the highly-acclaimed PHP 6 and MySQL 5 for Dynamic Websites  and within a week was writing simple, powerful, useful, and fun PHP code. The book is one that I have recommended to friends and though it assumes some programming knowledge, I found the book to be an excellent source in learning PHP from the ground up.

If you’re not familiar with PHP, it is a “web language”. What does that mean? It means that it is useful for writing web pages. To put it simply, PHP exists to spit out a bunch of HTML, process forms, interact with a database, and spit out some more HTML. Some great questions to ask are: where does PHP live? Do you compile it? Do you install it somewhere? What makes PHP such a good first language was that if your web server supports it, PHP is as easy as writing a couple lines of code in a simple text editor, sticking it on your web server, and then visiting that page with a web browser. Here’s an example page. Note that most of this is simple HTML, with just a couple lines of PHP code:

<html>
<head>
<title>PHP Test</title>
</head>
<body>
<?php
echo '<p>Hello World</p>';
$count = 10;
$animal = "monkeys";
echo '<p>There are ' . $count . ' ' . $animal . '!</p>';
?>
</body>
</html>

You’ll notice that the PHP lives in between the HTML. How easy is that? (Pretty easy!) To get output from PHP code you typically echo or print it out in the form of some HTML, maybe in between some paragraph <p> tags. You can also use PHP to create any sort of other HTML tags like dynamic <div> and layout tags, or maybe headers, or buttons, or CSS, or JavaScript. PHP stands for “PHP: Hypertext Preprocessor”, meaning it is executed on the web server before the resulting HTML is sent to the end user.

PHP also is a great language to learn in conjunction with a database, typically MySQL (because they play so well together). Databases are a big deal, and any aspiring programmer needs to know the basics of how to interact with one. What’s great is that PHP and MySQL are really easy and approachable. You don’t need to dive into crazy advanced topics to do some neat and very useful stuff. The week or two that I spent learning about databases when I was learning PHP has been super useful with other languages and future projects.

So why was PHP such an “exciting” language?

  • It was easy to get started (no installation)
  • I didn’t need to understand advanced topics to do simple things (dynamic weak typing, objects are optional)
  • I created cool stuff quickly (a dynamic website that my friends and I used)

The last point is pretty key. If you are a beginner learning how to program, you’ll want to operate in a way that gives instant feedback and satisfaction. The other components help (no installation, no worrying about data types) but if you don’t feel like you’re accomplishing something, you’ll probably give up faster.

What was the second language that was exciting?

During my senior year of college I took my “first” formal programming course (technically third, since the first was using MATLAB to create ray-tracing programs for an optics course, and the second was Mathematica for applied boundary value problems and Fourier analysis). This class was taught by the computer science department and those wishing to learn computer science typically took it during their first year in the program. The language that was chosen for introductory students was Python.

Why Python? That was the very first lecture, and I’ll share with you the bullet points taken straight off the PowerPoint presentation:

  • Named after Monty Python’s Flying Circus

Which probably just means the course instructor was a bit weird. But let’s look at what sort of programming assignments we had. Here’s the classic Hello World (with my addition) program in the Python IDE that comes bundled with the software installer (17 MB that unpacks to ~50 MB).

And let’s see what it looks like when it is executed (by pressing either F5 or going into Run -> Run Module)

Neat, huh? What’s also neat is that it takes all of 5 minutes to download, install, write, and run your first Python program. The software and packages are lightweight and come with their own text-highlighting editor (IDLE) and console shell. Now let’s look at some of the other programs that our first-year class made:

  1. A text scanner to analyze large text documents (entire books) to determine word counts per sentences, word frequencies, etc.
  2. Animations and simple 2-dimensional movies by procedurally drawing and moving shapes.
  3. Sound synthesizers and audio transformation tools.
  4. A simple web crawler.

They sound pretty neat, useful, and engaging. Yet these were created by first-year students, many of whom had never programmed before in their lives. You might be thinking, “Wow, a web crawler… something that is a fundamental part of Google, Bing, and Yahoo… how the hell does a first year student make something like that?” Check out the code here: it’s under 50 lines of Python.

Python has a great community and comes with some awesome documentation. It doesn’t have a difficult installation or a write, compile, run workflow. The debugging information is usually helpful. It doesn’t force you to use objects and classes (unless you want to) or think about which data type your variables should be. At the same time, it’s powerful enough to do just about anything you want to. Unless you’re developing enterprise-level software or working on extremely computationally intensive tasks (and most beginners aren’t!) you don’t need to write in Java or C/C++. In fact, Google is well known for using Python for all sorts of projects (such as this DNS benchmark utility). Many software engineers and research scientists are also turning to Python for their projects or for rapid prototyping. You may also notice that at many of the universities and colleges in the U.S., computer science departments are moving away from using Java for the first couple of courses. (Two of the schools in my area use Python for the first two CS courses in a four course sequence, Java for the last two, and C/C++ or Java for the remaining advanced courses).

Let’s summarize why I found Python to be so exciting:

  • It was easy to get started (the installation was quick and comes with its own IDE, called IDLE)
  • I didn’t need to understand advanced topics to do simple things (duck typing, objects are optional)
  • I created cool stuff quickly (a web crawler among others)
Look familiar?

So what language should I learn?

Many enthusiastic and aspiring programmers still get stuck on this question. Here’s the answer: Learn whatever language works for you, keeps you excited, and is flexible enough to do whatever you want. With that in mind, I personally believe Python really worked for me. PHP got me very excited and is the language that I really learned how to program with, but it also has a pretty specific purpose (web development). Sure you can create websites with Python, but PHP was incredibly easy to get started with. And this reflects another very important point: Use whichever language is most suitable for the work that you are doing. You can use C/C++ to create a DNS benchmark software for individuals to check their speeds at home, but it’d probably be a lot faster to use Python. At the same time, you can use Python to create an algorithm to find large prime numbers, but you’d be much better off using C or even assembly language.

As a beginner don’t stress out too much about this question. The best thing you can do is pick a language, run through some tutorials, and see how you like it. The things that you’ll discover while learning your first language will all translate over to the next one. After I had learned PHP and Python I took another stab at Java, this time with the goal of creating an Android app. Within a few weeks I was pumping out code and starting to appreciate (and understand!) all the advanced topics such as objects, classes, inheritance, and polymorphism. But I wouldn’t ever have gotten it if I had just forced my way into it. Learning how to code, all the features of certain languages, and computer science in general is a process that will never end.

So what are you waiting for? Get started!

Interested to learn how Google, Bing, or Yahoo work? Wondering what it takes to crawl the web, and what a simple web crawler looks like? In under 50 lines of Python (version 3) code, here’s a simple web crawler! (The full source with comments is at the bottom of this article).

And let’s see how it is run. Notice that you enter in a starting website, a word to find, and the maximum number of pages to search through.

Okay, but how does it work?

Let’s first talk about what a web crawler’s purpose is. As described on the Wikipedia page, a web crawler is a program that browses the World Wide Web in a methodical fashion collecting information. What sort of information does a web crawler collect? Typically two things:

  • Web page content (the text and multimedia on a page)
  • Links (to other web pages on the same website, or to other websites entirely)

Which is exactly what this little “robot” does. It starts at the website that you type into the spider() function and looks at all the content on that website. This particular robot doesn’t examine any multimedia; instead it is just looking for “text/html” as described in the code. Each time it visits a web page it collects two sets of data: all the text on the page, and all the links on the page. If the word isn’t found in the text on the page, the robot takes the next link in its collection and repeats the process, again collecting the text and the set of links on the next page. Again and again, repeating the process, until the robot has either found the word or has run into the limit that you typed into the spider() function.

Is this how Google works?

Sort of. Google has a whole fleet of web crawlers constantly crawling the web, and crawling is a big part of discovering new content (or keeping up to date with websites that are constantly changing or adding new stuff). However you probably noticed that this search took a while to complete, maybe a few seconds. On more difficult search words it might take even longer. There’s another big component to search engines called indexing. Indexing is what you do with all the data that the web crawler collects. Indexing means that you parse (go through and analyze) the web page content and create a big collection (think database or table) of easily accessible and quickly retrievable information. So when you visit Google and type in “kitty cat”, your search word is going straight* to the collection of data that has already been crawled, parsed, and analyzed. In fact, your search results are already sitting there waiting for that one magic phrase of “kitty cat” to unleash them. That’s why you can get over 14 million results within 0.14 seconds.

*Your search terms actually visit a number of databases simultaneously such as spell checkers, translation services, analytic and tracking servers, etc.
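To get a feel for what indexing means, here is a toy sketch of one common approach, usually called an inverted index: a table that maps each word to the set of pages containing it. Everything in it is made up for illustration (the PAGES dictionary, the example.com URLs, and the build_index and search functions), and a real search engine does far more (ranking, stemming, spell checking, and so on).

```python
from collections import defaultdict

# Pretend a crawler already collected this text (hypothetical pages)
PAGES = {
    "http://example.com/cats": "kitty cat pictures and cat facts",
    "http://example.com/dogs": "dog pictures and dog facts",
    "http://example.com/pets": "every kitty and every dog needs a home",
}


def build_index(pages):
    # Map each word to the set of URLs whose text contains that word
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    # Look each query word up in the index and keep only the pages
    # that contain every word. No page is re-read at search time.
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


index = build_index(PAGES)
print(sorted(search(index, "kitty cat")))
print(sorted(search(index, "pictures")))
```

The slow part (reading every page) happens once, when the index is built. Each search afterwards is just a couple of dictionary lookups, which is the basic reason a search can feel instant.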

Let’s look at the code in more detail!

The following code should be fully functional for Python 3.x. It was written and tested with Python 3.2.2 in September 2011. Go ahead and copy+paste this into your Python IDE and run it or modify it!

from html.parser import HTMLParser
from urllib.request import urlopen
from urllib import parse

# We are going to create a class called LinkParser that inherits some
# methods from HTMLParser which is why it is passed into the definition
class LinkParser(HTMLParser):

	# This is a function that HTMLParser normally has
	# but we are adding some functionality to it
	def handle_starttag(self, tag, attrs):
		# We are looking for the beginning of a link. Links normally look
		# like <a href="www.someurl.com"></a>
		if tag == 'a':
			for (key, value) in attrs:
				if key == 'href':
					# We are grabbing the new URL. We are also adding the
					# base URL to it. For example:
					# www.netinstructions.com is the base and
					# somepage.html is the new URL (a relative URL)
					#
					# We combine a relative URL with the base URL to create
					# an absolute URL like:
					# www.netinstructions.com/somepage.html
					newUrl = parse.urljoin(self.baseUrl, value)
					# And add it to our collection of links:
					self.links.append(newUrl)

	# This is a new function that we are creating to get links
	# that our spider() function will call
	def getLinks(self, url):
		self.links = []
		# Remember the base URL which will be important when creating
		# absolute URLs
		self.baseUrl = url
		# Use the urlopen function from the standard Python 3 library
		response = urlopen(url)
		# Make sure that we are looking at HTML and not other things that
		# are floating around on the internet (such as
		# JavaScript files, CSS, or .PDFs for example)
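		# (the header often looks like 'text/html; charset=utf-8',
		# so we check whether it contains 'text/html')
		if 'text/html' in response.getheader('Content-Type', ''):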
			htmlBytes = response.read()
			# Note that feed() handles Strings well, but not bytes
			# (A change from Python 2.x to Python 3.x)
			htmlString = htmlBytes.decode("utf-8")
			self.feed(htmlString)
			return htmlString, self.links
		else:
			return "",[]

# And finally here is our spider. It takes in an URL, a word to find,
# and the number of pages to search through before giving up
def spider(url, word, maxPages):
	pagesToVisit = [url]
	numberVisited = 0
	foundWord = False
	# The main loop. Create a LinkParser and get all the links on the page.
	# Also search the page for the word or string
	# In our getLinks function we return the web page
	# (this is useful for searching for the word)
	# and we return a set of links from that web page
	# (this is useful for where to go next)
	while numberVisited < maxPages and pagesToVisit != [] and not foundWord:
		numberVisited = numberVisited +1
		# Start from the beginning of our collection of pages to visit:
		url = pagesToVisit[0]
		pagesToVisit = pagesToVisit[1:]
		try:
			print(numberVisited, "Visiting:", url)
			parser = LinkParser()
			data, links = parser.getLinks(url)
			if data.find(word)>-1:
				foundWord = True
			# Add the links that we found to the end of our collection
			# of pages to visit (whether or not the word was on this page):
			pagesToVisit = pagesToVisit + links
			print(" **Success!**")
		except Exception:
			print(" **Failed!**")
	if foundWord:
		print("The word", word, "was found at", url)
	else:
		print("Word never found")

Magic!
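If you want to poke at the link-collecting part without touching the internet, here is a small offline sketch. It is a simplified cousin of the LinkParser class above, not a replacement for it: LinkCollector, SAMPLE_HTML, and the example.com addresses are all made up for this example. It feeds a hard-coded HTML string to the parser and shows how urljoin turns relative links into absolute ones.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects every <a href="..."> on a page as an absolute URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    # urljoin resolves relative links against the base URL
                    # and leaves links that are already absolute untouched
                    self.links.append(urljoin(self.base_url, value))


# A hard-coded page so that no network access is needed (hypothetical)
SAMPLE_HTML = """<html><body>
<a href="somepage.html">Relative link</a>
<a href="/about">Root-relative link</a>
<a href="http://www.example.org/other">Absolute link</a>
</body></html>"""

collector = LinkCollector("http://www.example.com/blog/index.html")
collector.feed(SAMPLE_HTML)
for link in collector.links:
    print(link)
```

Once you are happy with how links are collected, running the real crawler is a single call to spider() with a starting URL, a word to look for, and a page limit of your choosing.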

We live in 2011, complete with computers and the ever present internet and world wide web. Nearly everything has a website, but do you? This guide will attempt to explain everything you need to do, starting from scratch, to get a website up and running. Whether it is a personal website, a new business website such as a restaurant, or a complex number-crunching website such as Google, I’ll detail each step and provide enough information for you to get started. Here is a rough outline:

  • Determining your domain/brand name

  • Finding a web host

  • Registering your domain name

  • Designing the website

  • Uploading/Updating your website

  • Troubleshooting and testing

Each of those bullet points will have a dedicated section below, so feel free to skip to those sections if you’d like. You may be wondering what my motivation is for making this guide. Well, to be honest, this is an experiment. I was sitting in front of my computer a couple years ago, surfing the web, and suddenly wondered: with all these websites in the world, why didn’t I have one? I was standing in your shoes 2 years ago. That started the journey into researching just exactly how to go about it. I researched costs, web hosting servers, domain registrars, and different ways to create the actual HTML and CSS that powers a website. Along the way, I realized there was a lot of bad advice and even more false advertising. People made biased guides just to get others to sign up for their webhost and collect a profit, and other guides were just full of advertisements for a particular technology that no one needed. I’m here to give you some personal advice, as well as plenty of choices and options along the way.

I’m not trying to get you to sign up for my web host. I’m not trying to sell you some sort of EZ-Website Maker Deluxe. I’m not telling you which domain registrars to use, or which products to buy. But I will offer my advice and the lessons I learned while creating my first website.

So, let’s begin…

Determining Your Domain / Brand Name

Basically you have two routes here: do you already have a name (such as a restaurant or company that’s been in business for a while), or are you a startup company that wants one of those ambiguous, catchy “web 2.0” names like Google or Bing?

If you are going for the latter, you’re in for some bad news (but don’t be too sad). Unfortunately for you, the world wide web has been around for some 20+ years, and domain registrations are only around $10 / year. That means someone can buy tens or even hundreds of domain names and hold on to them. Imagine a company that makes a relatively small profit of $3 million per year. They could buy thousands of domain names for an insignificant amount of cash. Now imagine Apple, Google, or Microsoft, which make billions of dollars per year, and imagine how many domain names they can simply “hold onto” just in case. Do they do that? Probably.

What I am trying to get at is the fact that domain names are fairly cheap and there is no limit to the number you can have. Many of the short, catchy, one or two syllable, “good” domain names are already sold and registered. Does that mean you can’t ever get them? No. But you’ll certainly have to pay more than $10 to get one.

There are many websites out there that allow you to bid on or buy existing domain names. The prices here will vary, from as little as $30 to upwards of $500. For short 4 or 5 letter domains, you may be expected to pay thousands.

If your company or business already has a name, you might be inclined to use that as your domain name. However, there are a few things to be aware of.

Is your business name catchy? Do you have a strong brand? Do you expect people to already know your name? In that case, go ahead and use that as your domain name.

However, if your business is relatively unknown, or if you are expecting your website to bring in visitors you may want to include some keywords in your domain name. For example, I could have named this website stephensswriting.com but I instead wanted my domain name to explain my website’s purpose. That’s right, picking out your domain name is the first step in optimizing your website for search engines and discoverability. If your company was a small ceramic company called Bakerlite, you might want to try bakerliteceramics.com. The other option, www.bakerlite.com is not very helpful to search engines unless people already associated Bakerlite with ceramics. In that case, it would be wise to reinforce the brainwashing, er… I mean, association.

Let me conclude with some ideas of pricing and how to go about registering your domain name. First of all, expect to pay no more than $10/year for a domain registration. In addition to a domain name registration, you also need a web host (more on that below). Many places that host your website also offer domain registrations. You should also be able to register a domain through one company, and then host it at a different company.

Finding a Web Host

Now that you have a domain name picked out (and possibly registered) you’ll need a web host. This is essentially a “computer” connected to the internet. This “computer” is always on, stores your website (this includes text, pictures, video, HTML, CSS etc.), and accepts incoming connections (visitors) and serves them the data that they want (the text, pictures, video, HTML, and CSS etc.)

In fact, your computer right now could theoretically host your website. The problem is that you’d always have to have it turned on and have a fast connection to the internet. Additionally, you’d need some software running to receive HTTP requests. The most common software to do this is the Apache HTTP Server.

The thing is though, if you are reading this guide (meaning you’re a beginner) you probably won’t have the technical skills and resources to personally get your own Apache HTTP Server up and running. It would be much easier for someone else to do it.
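That said, if you are curious what “software running to receive HTTP requests” looks like, Python ships with a tiny web server in its standard library. The sketch below is only a toy for poking around on your own machine, not something to put a real website on: it serves a temporary folder on 127.0.0.1 (so nothing outside your computer can reach it) and then plays the part of a visitor by requesting a page. The function names start_local_server and fetch are my own, and it assumes Python 3.7 or newer.

```python
import http.client
import http.server
import pathlib
import tempfile
import threading
from functools import partial


def start_local_server(directory, port=0):
    """Serve the files in `directory` on this machine only (127.0.0.1)."""
    handler = partial(http.server.SimpleHTTPRequestHandler, directory=directory)
    # Port 0 asks the operating system for any free port
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server, path):
    """Act like a visitor's browser: send an HTTP GET, return the body."""
    host, port = server.server_address
    connection = http.client.HTTPConnection(host, port, timeout=5)
    connection.request("GET", path)
    body = connection.getresponse().read().decode("utf-8")
    connection.close()
    return body


# Create a one-page "website" in a temporary folder and request it
with tempfile.TemporaryDirectory() as site:
    pathlib.Path(site, "index.html").write_text("<p>Hello from my own web server</p>")
    server = start_local_server(site)
    page = fetch(server, "/index.html")
    server.shutdown()
    server.server_close()

print(page)
```

A real web host does the same basic job (wait for a request, send back a file) but with hardened software, a fast connection, and someone keeping it running around the clock.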

There are a lot of good resources for picking out the company that will ultimately host your website. While you are doing your research, keep in mind some of these keywords:

  • Bandwidth – This is how much data flows to your visitors. If you plan on having lots of images, files, or movies on your website, bandwidth becomes important. Many web hosts advertise “unlimited” bandwidth. There is no such thing as unlimited bandwidth; the company simply hopes you never notice or push up against the limit.
  • Disk space – This is how much data is on your website. If you plan on having images, files, or movies, keep in mind that they take up a lot of space. Again, there isn’t really anything such as unlimited disk space. If you plan on backing up your personal hard drive onto your web host’s disk space, they’ll likely make the transfer so painfully slow that you’ll give up.
  • Uptime – This is the percentage of time the web server stays on. Ideally this number should be 100% (always on). Many companies advertise 99.999% uptime, which is some statistic that is likely made up. The best way to determine how true this claim is is to ask current customers. Check the company blogs and Twitter and look at customers’ comments. Are they happy?
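
To put uptime percentages in perspective, here is a small sketch that converts an advertised uptime figure into the downtime it allows per year. The function name is hypothetical and the calculation assumes a 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(uptime_percent):
    """Hours per year a server may be offline at a given uptime percentage."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


for uptime in (99.0, 99.9, 99.999):
    minutes = downtime_hours_per_year(uptime) * 60
    print(f"{uptime}% uptime allows about {minutes:.0f} minutes of downtime per year")
```

At 99.999% the allowance is only about five minutes in an entire year, which is why that claim deserves some skepticism.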

The web host that you’ll ultimately pick is up to you. A small website for your friends and family (maybe 300 visitors/month) won’t need all the bells and whistles of a website like Google (maybe 30,000,000 visitors/month). You should realize though that if you have a small website and one day you post something that is incredibly awesome and “goes viral” on the internet, expect your website to go down. Many web hosts offer upgrade paths to allow your website to grow, but don’t expect this to happen on the hour your website suddenly becomes popular.

If you choose wrongly about your web host or you get upset at them, keep in mind that unless you signed up for an incredibly sketchy web host, it shouldn’t be too difficult to take your content and move to a different web host. Your website (the content including text, design, pictures, etc) is your website. You own it. Not your web host.

I will briefly mention that after doing quite a lot of research around the web I found that Dreamhost sounded like a good web host. It wasn’t the cheapest option out there, but it looked to be the most trustworthy. It also helped that it had a large base of satisfied customers. If you’re interested, check them out here. Disclaimer: I receive a referral bonus if you sign up for them through that link.

Registering your Domain Name

The next step (or the same step) is registering your domain name. The reason why this might be a different step is that your domain name and your web host can be two separate entities. You can register your domain name at Company X and host your website at Company Y. It might be a little tricky and there will be more steps involved if you do this, but this is an option.

As mentioned, when you sign up for a web host, they’ll often include 1 domain registration with the web hosting, sort of a bundled “package”. If they don’t include 1 registration, you can likely pay the $10 for it through the same company at the time of sign up.

You might be wondering who is in charge of all the names of all the websites that make up the world wide web. After all, isn’t the world wide web supposed to be a collection of independently created content that spans international and inter-continental distances? Where does your $10 go when you register a domain name (Who’s making all the money? And why didn’t I think of that!?).

The answer that you are probably looking for is the Internet Corporation for Assigned Names and Numbers (ICANN). But they don’t directly receive your $10. Instead, ICANN delegates the tedious job of selling and registering domain names to various ICANN-accredited domain registrars. An example of an ICANN-accredited domain registrar is GoDaddy. These second-tier registrars are the ones usually interfacing with the public (you and me) and asking for the $10 in return for registering a domain name. The difference is that second-tier registrars may purchase a few million addresses (they’re buying in bulk) and resell them at a “retail” price to customers.

So now imagine that we have a domain name and a web host. The next step is…

Designing the Website

When you type in www.google.com in your web browser, what happens? Well, behind the scenes, your web browser first asks the Domain Name System (DNS) to translate a human-readable address, such as “www.google.com”, to an IP (think “computer-readable”) address such as 74.125.226.176. Next an HTTP request is made to the server located at 74.125.226.176 that basically says “I want your data”. Now that you have an address (or domain name), and a server (a web host), you can send some data to people whose browsers are making HTTP requests.
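
The “I want your data” message is itself just a few lines of plain text. Here is a minimal sketch that builds (but does not send) a bare-bones HTTP GET request so you can see what it looks like. The function name is hypothetical, and a real browser sends many more header lines than this.

```python
def build_get_request(host, path="/"):
    """Build the text of a minimal HTTP/1.1 GET request for `path` on `host`."""
    lines = [
        f"GET {path} HTTP/1.1",   # the request line: method, path, protocol version
        f"Host: {host}",          # which website on that server we want
        "Connection: close",      # ask the server to close the connection afterwards
        "",                       # a blank line marks the end of the headers
        "",
    ]
    # HTTP separates lines with a carriage return plus a line feed.
    return "\r\n".join(lines)


print(build_get_request("www.google.com"))
```

The DNS step happens before any of this text is sent: the browser needs the server's numeric address first, and only then delivers the request to it.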

What exactly is this data that comes from a web server to a client (a visitor)? Well, for the most part, it is a bunch of text mixed in with some images, perhaps a movie, or maybe an Adobe Flash game. This is what is referred to as content on a website. A website consists of content:

  • Text – You are reading a bunch of text right now, aren’t you? Other text may be interactive and include hyperlinks to pages or other websites.
  • Multimedia – Any pictures, videos, music, PDF files, Flash applications, Silverlight applications, Microsoft Word Documents that you can download.
  • Design – This includes any CSS (and accompanying HTML) of your website. CSS will be described below, but in short this is the code that describes how your text and multimedia should be presented to the end user (a visitor).

Let’s first examine what a website really looks like, before your web browser makes it all pretty. Depending on your web browser (such as Mozilla Firefox, Internet Explorer, Google Chrome, Safari, etc.) these steps may be a little different.

  • Firefox – Go to View and then select Page source. Alternatively you can right-click anywhere on a page and select View page source.
  • Chrome – Right-click anywhere on a page and select View page source.
  • Internet Explorer – Go to View and then select Source.

As you can see, there is quite a bit of text. You might see some common themes though, such as <a href="some_address">Some text</a> or maybe some <div id="something"></div>. An image might look like <img src="address_to_image" alt="alternative text" />. Your web browser takes all of this text and renders it into a website that is pleasing for humans to see and interact with.
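
To show that this text really is structured data a program can read, here is a small sketch that picks the link and image addresses out of a snippet of HTML using Python's standard html.parser module. The class name TagCollector and the sample snippet are made up for illustration; a browser does far more than this.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record every link target and image source found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.images = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and "href" in attributes:
            self.links.append(attributes["href"])
        elif tag == "img" and "src" in attributes:
            self.images.append(attributes["src"])


page = '<p>Read <a href="about.html">about us</a> <img src="logo.png" alt="Logo" /></p>'
collector = TagCollector()
collector.feed(page)
print(collector.links)   # addresses found in <a> tags
print(collector.images)  # addresses found in <img> tags
```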

What is all this code that I see? Are there some recurring themes?

  • HTML is very common and can be thought of as all the little pieces or building blocks of a website. It describes where headers, paragraphs, links, pictures, divisions, and just about everything else goes.
  • CSS is used in conjunction with the HTML elements. CSS describes how a particular header looks, the indentation of a paragraph, or the length and width of a division to name some examples.
  • JavaScript is code that performs tasks or functions for the visitor. Many websites can function without JavaScript, but other websites will usually have JavaScript code running to control simple things like fading images or advanced things like asynchronous calls to a database or formatting a website on the fly. A quick thing to note: JavaScript is NOT the same thing as Java (another programming language). It’s also primarily used for client-side execution (meaning the code is run on a visitor’s computer, as opposed to code that runs on the website’s computer or server). Code that runs on a web server (such as PHP, Python, Perl, Ruby, etc.) is code that a visitor will never see, and therefore you will not see it by looking at the page source.

Okay, so how do I create this content? Now here is where it can get complicated. There are literally hundreds (possibly thousands) of editors and website generators that you can use. You can use a pre-formed template and fill in the missing blanks. You can use a graphical editor where you drag images and word blocks around to position them on the screen. You can use a simple text editor like Notepad or Microsoft Word (though MS Word is not usually a good idea for web design). You can use a hybrid editor like Adobe Dreamweaver. You can even use an editor that is inside your web browser, such as the WordPress editor (which is actually called the TinyMCE editor). The thing to keep in mind, at the end of the day your visitor is still receiving the same HTML/CSS “data” that was described above*.

Why the *asterisk? It’s very possible that the “helper” editors such as Dreamweaver or WordPress will accidentally add in extra spaces, extra <span> blocks </span>, or occasionally refuse to format your paragraphs and content exactly how you want them to look. Most of the time this isn’t a big issue, but there are always those purists who need maximum control and love to dive into the nitty-gritty raw HTML and CSS. Many of these purists will use simple text editors like WordPad or Notepad. Let’s look at what a very basic web page looks like in one of these editors:

<!DOCTYPE HTML>
<HTML>
<HEAD>
<TITLE>Super Basic Website</TITLE>
<META name="keywords" content="Test" />
</HEAD>
<BODY>
<H1>Welcome to the super basic website</H1>
<P>Here is a paragraph on a website</P>
<div id="sidebar">
<P>Hello sidebar!</P>
</div>
</BODY>
</HTML>

If you’re adventurous, you can open up a new WordPad or Notepad document, copy and paste the above HTML into the editor, and save it as something like testwebsite.html. You can then open that in your web browser (Firefox, Internet Explorer, Chrome) and see before your very eyes how the web browser takes the HTML and renders it into a web page.

At this moment I’d like to point out so-called WYSIWYG editors. WYSIWYG stands for What You See Is What You Get, which usually means you will not be working with basic raw HTML like what is shown above. Instead you’ll be editing text that is already rendered as it would be in a web browser. Instead of seeing:

<strong>This text is bold</strong>

you’ll see something like:

This text is bold

What’s great about many WYSIWYG editors is that they usually let you switch back and forth between working in the rendered mode and the raw HTML mode to get the best of both worlds. I find that it is much easier to work in a rich, full-featured rendered mode for nearly everything, but when there are extra <span> blocks or indentations and lists are not working exactly as intended I can click a button and switch over to the raw HTML.

I’ll make a quick note about raw text editors. Notepad and WordPad are suitable for writing basic, unformatted text, but aren’t the best for writing and examining code. Take a look at the same HTML in each of these two editors:

You’ll notice that everything is color coded and indented nicely. This makes editing code drastically more efficient without adding any additional complexity (a rare win-win with most new technology). Another feature of coding editors is that brackets </>, curly braces {…}, and parentheses (…) change colors or boldness to let you know where you left one off. For these reasons I would strongly suggest using something just slightly fancier than WordPad or Notepad for editing code such as HTML. An excellent tool that I have used (and many, many others have used) is Notepad++. It’s free, light-weight, open-source, and is not any more difficult to use than a regular text editor.

Now let’s move onto something more advanced than just typing the raw HTML into a text editor– using WordPress to create content for a website. What’s neat about editing on WordPress is that you can edit your website on your website. What does that mean? It means that you open up a web browser, go to your website, press the log-in button or link, and can start typing up a new post right inside of your web browser. Notice the two buttons that let you switch between typing in a WYSIWYG editor and typing out some raw HTML. You’ll spend most of your time in the WYSIWYG editor, or what WordPress refers to as the “normal” editor.

Uploading/Updating your Website

Okay so imagine you picked out a domain name, registered it, and purchased web hosting. Now how do you put your first web page out onto the World Wide Web? Here are a couple options:

  • You can FTP/SFTP to a web server to upload a file such as about.html or index.html. FTP stands for File Transfer Protocol and SFTP stands for Secure File Transfer Protocol.
  • If WordPress or some other Content Management System (CMS) is installed on the webserver you can visit your website, log in, and edit pages through your web browser.

Let’s look at the most basic approach first: uploading a file to your web server. Imagine you created the HTML file that we discussed earlier. This is a web page in its simplest form. You can name it whatever you’d like, whether that is testwebsite.html or blahblah.html. Now you need to put it onto your webserver so other people can visit the webpage and see it rendered out in their web browser.

Hopefully once you signed up for a webserver they gave you a username and password for it. Now you just need a piece of software that will connect to that web server and allow you to transfer the testwebsite.html file to it. An excellent free, light-weight, and open source tool that I use all the time is FileZilla. Here’s how I transfer a file to the webserver:

Here the file is on my desktop

Here I am transferring the file using FileZilla SFTP software

And after typing the address into any web browser, here is the new page on the world wide web!

Comparing web browsers

That was pretty simple, right? In FileZilla you can also create directories. Right-click on a folder and select Create Directory. Give it a name, maybe “about”, and now you can put web pages in various directories. For example, you can create a page called thecompany.html and thefounders.html and put them in a directory called “about”. Visit those pages by going to www.yourwebsite.com/about/thecompany.html or www.yourwebsite.com/about/thefounders.html.

One thing to note is the special name index. If you name any file index, that is the file that will show up if no other file is specified. This is useful when you want a specific page to show up when you visit your website. If I renamed TestWebsite.html to index.html, I would visit the page by just typing in www.netinstructions.com instead of www.netinstructions.com/TestWebsite.html.
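
Here is a rough sketch of the rule a web server follows when deciding which file to send back. The function name and the example site root are hypothetical, and real servers make this behavior configurable (Apache, for instance, has a DirectoryIndex setting), so treat this as an illustration rather than how any particular host works.

```python
import posixpath


def file_for_url_path(url_path, site_root="/home/user_name/yourwebsite.com"):
    """Map the path part of a URL to the file a web server would look for."""
    # A path that names no file (empty, or ending in "/") falls back to index.html.
    if not url_path or url_path.endswith("/"):
        url_path += "index.html"
    return posixpath.join(site_root, url_path.lstrip("/"))


print(file_for_url_path("/"))
print(file_for_url_path("/about/thecompany.html"))
print(file_for_url_path("/about/"))
```

So a visitor asking for the bare address is quietly given index.html from the site root, and a visitor asking for a directory is given the index.html inside that directory.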

Now let’s move away from using FTP/SFTP software and a text editor to create web pages. Instead, let’s create a web page using WordPress. The following pictures and steps assume that you already have WordPress installed on your web server.

Just visit your website by typing in the address

Visit the website

Log in with your username and password (this will be set up right when you install WordPress)

Log in

Now you can create a new post or edit an existing post by just typing into the text box. Click update or publish when you’re done.

Edit the content

Note that you can switch between looking at the raw HTML and the rendered content

Note the HTML option to look at the underlying raw HTML

As you can see, editing and adding content through WordPress is pretty simple. Many websites these days allow users to add content through software that runs on the web server and is accessible with a web browser. This lets you maintain your website on just about any device (computer, tablet, phone) anywhere in the world at any time. Other content management systems (CMSs) exist besides WordPress such as Joomla, Drupal, Plone, Tumblr, Blogger, and many more.

Troubleshooting and Testing

As you’re going through these steps, you may have some issues. I’ve selected a few of the more popular problems that pop up from time to time.

To FTP or to SFTP? And where to put the files?

When you want to upload files to your web server or web host the most common way to do this is with a client using the (Secure) File Transfer Protocol. It is strongly suggested that you do not use FTP and instead use the secure protocol (SFTP). It’s not difficult to use SFTP instead of FTP. For example, you’ll want to use port 22 instead of port 21 when you are using FileZilla. Many other FTP and SFTP clients will just have a checkbox or a setting to switch between the two. The reason why you do not want to use FTP is that your username and password are passed from your computer across the internet to your web server in an unencrypted and exposed way. When you instead use SFTP, your username and password are encrypted before they are sent out across the internet. If a man-in-the-middle were to intercept your transfer of packets, they wouldn’t be able to “see” your username and password.

Is it really possible for someone to “intercept” your packets? Yes, absolutely. It may not be likely, and it will probably not be a human, but there is certainly the possibility that some router that your packets travel through to reach the web server will have software or code looking for insecure FTP credentials. If you are really curious how many routers sit between you and your web server, a simple trace route will show you. On a Windows machine, open the command prompt by choosing Run… and typing cmd, or by typing cmd into the search bar. Once you’re in the command prompt, type tracert yourwebsite.com. After a moment this will show you all the routers that your information passes through before it reaches the destination.

One thing you’ll notice when you use the SFTP for the first time is the need to accept the key initially. A warning box may appear such as shown below:

It is probably safe to trust the host when connecting to it for the very first time. If you add this key to the cache, regularly connect to the host, and one day the server’s key changes, you can start to act suspicious.

Another helpful point to make is where you’ll want to stick your web pages. Most web servers are running Linux (the operating system) and Apache (the software that listens for and allows incoming HTTP requests). On those machines you’ll typically want to stick your web pages and content at the site root. Some site roots might look like:

  • /home/user_name/yourwebsite.com/stickyourpagehere.html
  • /home/www/yourwebsite.com/stickyourpagehere.html

DNS settings and how to change them!

DNS (Domain Name System) settings are typically associated with your domain name. When you own your domain name you should be able to change some of the DNS settings if you so desire (and most of the time you probably wouldn’t unless you were manually setting up email services, pointing your domain at a new host, setting up Google Apps for domains, or adding subdomains). But I’m including this here so you know they exist. Here are some fields and possible values (a complete list of DNS record types is here):

  • A is for mapping a hostname (think domain) to an IP of the host
  • MX is for use with mail exchange
  • TXT is for a simple textual message and you could theoretically put a random message here, but why would you?

What are your DNS settings and record types? There are lots of web tools out there that let you see them. Let’s see what the settings are for netinstructions.com:

  • MX 10 ASPMX.L.GOOGLE.com means Google is handling my mail. This is because I have Google Apps for domains because I really like GMail to handle all of my email needs.
  • MX 20 ALT1.ASPMX.L.GOOGLE.com is a backup in case the first mail exchange server is down. Redundancy is important! The number indicates the order of preference: servers with lower numbers are tried first.
  • MX 20 ALT2.ASPMX.L.GOOGLE.com is yet another backup. You’ll also see many more.
  • A 173.236.239.73 is the IP address that is used to visit the website.
  • SOA server: ns1.dreamhost.com means that the primary name server for my domain is ns1.dreamhost.com, which suggests that Dreamhost is my web host. It’s possible to look at DNS records of other websites to make an educated guess about who their web host is.
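
The order in which the MX records get used can be sketched in a few lines. This is only an illustration using the records listed above: the function name is hypothetical, and real mail software may choose among servers that share the same number in any order.

```python
mx_records = [
    (20, "ALT1.ASPMX.L.GOOGLE.com"),
    (10, "ASPMX.L.GOOGLE.com"),
    (20, "ALT2.ASPMX.L.GOOGLE.com"),
]


def delivery_order(records):
    """Return mail server names, most preferred (lowest number) first."""
    # sorted() is stable, so servers sharing a number keep their listed order.
    return [host for preference, host in sorted(records, key=lambda r: r[0])]


print(delivery_order(mx_records))
```

The server marked 10 comes out first, followed by the two backups marked 20.
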
To change your DNS settings you would want to visit the place where you registered your domain, not necessarily the people who are hosting your web site (unless they are the same company).

Lastly I’d like to mention that DNS settings take time to propagate throughout the internet. If you change your DNS settings at the company that you bought the domain from, and then immediately go and run a DNS records look-up tool at network-tools.com, you probably won’t see the changes. A number you’ll see attached to most of the records is a time-to-live (TTL) such as 14400s, or 4 hours. This tells DNS servers around the internet how long they may keep a cached copy of the record before checking for an update.
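
To see why a change does not show up right away, here is a toy sketch of how a DNS server might remember an answer until its time limit runs out. The class name DnsCache is hypothetical and real DNS software is considerably more involved; the address and the 14400 second figure come from the records above.

```python
import time


class DnsCache:
    """Remember DNS answers until their time-to-live (TTL) runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def store(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def lookup(self, name):
        """Return the cached address, or None once the entry has expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]
            return None
        return address


cache = DnsCache()
cache.store("netinstructions.com", "173.236.239.73", ttl_seconds=14400)
print(cache.lookup("netinstructions.com"))  # served from the cache for the next 4 hours
```

Until those 14400 seconds (4 hours) pass, anyone asking this cache gets the old answer, even if the record has since been changed at the registrar.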

PHP/MySQL Requirements for WordPress

When you are picking out a web server to host your website, you’ll probably want support for certain web based programming languages and databases (PHP, Perl, Python, MySQL to name some). Even if you don’t plan on writing your own custom code immediately, it is likely that you or someone on your team will want to expand your website’s capabilities in the future. If you plan on using any Content Management Systems such as WordPress, Drupal, Plone, or any others, your web server will need to support the languages that those CMS’s are built on. WordPress requires PHP and MySQL, whereas Plone requires Python.

A good web host will proudly list all of the web languages and services that they support, as well as the current version. For example, the web host Dreamhost currently supports PHP 5 and MySQL 5 and the current WordPress version requires PHP 5.2.0 or greater and MySQL 5.
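
Checking a host's advertised version against a requirement is a simple comparison, as long as you compare the numbers piece by piece rather than as text. Here is a minimal sketch; the function names and the sample "installed" versions are hypothetical, and it only handles purely numeric versions such as 5.2.0.

```python
def parse_version(text, width=3):
    """Turn a version string such as "5.2.0" into a tuple of integers."""
    parts = [int(part) for part in text.split(".")]
    # Pad short versions so that "5" is treated the same as "5.0.0".
    parts += [0] * (width - len(parts))
    return tuple(parts)


def meets_requirement(installed, required):
    """True when the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


print(meets_requirement("5.3.1", "5.2.0"))   # newer than required
print(meets_requirement("5.1.6", "5.2.0"))   # older than required
print(meets_requirement("5.10.0", "5.9.0"))  # 10 is greater than 9 when compared as numbers
```

Comparing the raw strings instead would get the last case wrong, because "5.10.0" sorts before "5.9.0" alphabetically.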

Conclusion

Hopefully at this point you have seen a very broad overview of how to make a website starting from nothing. As discussed, there are multiple ways of accomplishing each task. As you work on building your website you’ll discover what works best for your needs. Personally I built my first website using Adobe Dreamweaver (a WYSIWYG editor) and by following the book Dreamweaver CS3: The Missing Manual by David McFarland. My second and third websites were both done in Notepad++ and uploaded via FileZilla. For those I wrote a custom PHP backend interacting with a home built MySQL database. I found the book PHP 6 and MySQL 5 by Larry Ullman to be very helpful for those two projects. My last three websites have all been WordPress based and I am currently learning how to write my own custom themes. You’ll soon realize that each approach towards building a website has its own set of pros and cons.

I would encourage you to do some of your own research in finding a decent web host and domain name registrar. Find a company that you trust and is transparent about their uptime. See if you can find any real customer reviews (I found a lot of fake reviews and fake websites built just for recommending Company X or Company Y). After doing lots of research my personal choice was Dreamhost. If you sign up with them through that link, I will receive a referral bonus (thank you!). I have been a customer of Dreamhost for about 3 years and have been very happy with them. However, do your own research! They are a great host for me, but your needs may be different.

Lastly I’d like to add that this is my first tutorial. Any feedback, criticism, experiences, or opinions are welcome and encouraged. Leave a comment below!