Are OFWs our biggest heroes?

Disclaimer: I am an OFW. I do recognize our part in keeping the Philippine economy running. I know the hardships and sacrifices our kind undergo: working far from home and family, having to work in alien cultures and, in some cases, suffering discrimination and abuse. I know the part we play in keeping it all together for our country. Our remittances have saved the nation time and time again, and without them our economy would probably be in shambles right now.

Let me define the context of the word “hero” first. I refer primarily to a patriot: somebody with an awareness of and deep connection to his nationality, almost bordering on jingoism. Someone who has made a meaningful sacrifice, one that cannot be easily duplicated. That sacrifice has to be made with a specific goal in mind: the benefit of the community or the nation.

I do remember trying the greener pasture in a neighboring country not out of a sense of sacrifice for my country, but rather out of the simple notion of getting a bigger paycheck. I don’t wish to apologize for it, and certainly not to glamorize it, but I nonetheless admit that it was selfish when compared to what those I consider heroes aspire to. It was not entirely selfish, since a major part of it was my willingness to provide for my family. Being the breadwinner for my retired parents, plus a wife and newborn, I think my decision was entirely justified, even if the only motive was financial survival.

The thought of doing it to better our economy was non-existent. I have heard tales of OFWs ever since I was a kid: the songs, commercials and movies made to glorify this particular type of worker. Not that there are many of those anyway, but the appreciation from the government was there. Whether it was lip service or genuine patronage is of no consequence in this article.

Nor have I met anybody who did it for the purpose I just described. The story was always the same: there is a company in a foreign land willing to pay for my services at a higher rate. Sending that hard-earned cash back to the homeland will somehow triple its value, thanks to the exchange rate. It is globalization at its finest, and they think they should take full advantage of it. This is the psychology of a would-be OFW. I bet nearly every OFW out there will agree on this.

Now, is that heroic? Maybe for our families. Though, as a father, that is my responsibility. The only difference between working in another country and working back home is the pay and the proximity to loved ones. Other than that, it’s just plain work, the very promise of capitalism we are all trying to get by on, with the hope of a comfortable retirement one day. It’s just the same as going from the provinces to Manila. It’s almost the same as the daily commute we endure to get to our offices or work sites. It’s an eventful act of separation from our comfy beds so we can pay rent and put food on the table.

Let me paint a picture of heroism that I think is a real representation and should be aspired to by the youth. It has nothing to do with a bigger paycheck. It may contribute to a blissful retirement one day. It does separate one from loved ones, but not by a great distance. The heroes I refer to are the public school teachers. I know teaching is the noblest of professions, but they do it on an entirely different level. Low wages, 60 students per class if you are lucky, decades-old equipment, amongst other things. One could argue that they do it for the wages anyway and not for the love of kids or the country, but there is one event where some of them show this is not the case. I haven’t seen this done by an OFW, nor anything that remotely resembles it.

I speak of national elections. Teachers have the fortunate additional task of running the actual electoral process, from hosting the voting centers to manual counting to distribution of ballots and the like. Now, elections in the Philippines, especially in rural areas, are not as orderly as in developed nations. They are full of sensationalism and violence, and are mostly a popularity contest. Owe it to the corrupt culture where candidates usually vie for a position in order to steal money and recoup their campaign expenses. Most importantly, owe it to the voters who do not know better.

Some of these teachers have been held at gunpoint by corrupt candidates wanting to take the ballots for whatever scheme they had in mind for their victory. Many of these teachers refused, at the cost of their lives. All of them are too underpaid to protect these pieces of paper with their lives, much less to be responsible for this civic duty in the first place. Those who died did so because they believed in the power of the ballot and the importance of elections. Whether elections in the Philippines actually represent the opinion of the majority, aka the masses, or whether they really bring progress for the country is worthy of another discussion. This particular act of sacrifice presupposes a level of thinking that takes into consideration the abstract notion of a democracy, something that can never feed their families nor guarantee them a retirement. A really high level of consciousness indeed. Heroes from every angle, national heroes, worthy of respect and admiration.

Now, contrast this with a typical OFW, like me. I believe to some degree that, given a chance to switch citizenship to a more developed country, most Filipinos would do so without a second thought. Especially if they can bring over their family and have a job waiting for them in the new land. I don’t see anything wrong here; people from every civilization, at every point in human history, have done this. It’s a natural human trait. Entirely justified.

Is that a heroic attitude? 

I still love my country, and would like to go there every now and then. This is mostly due to the idea that there’s no place like home, even if your home is a poorly-run joke fest of a republic. I still want to wear my Filipino pride everywhere I go, but if there’s a slight chance for a better life somewhere else for my family I will definitely give the latter a go.

Contrast our sacrifices to that of public school teachers especially during election time. Contrast that to the soldiers fighting our fellow countrymen in the South. Contrast that to the honest and hard working employees of the civil service. 

I am not belittling the conditions that some other OFWs are undergoing right now. The psychological burden of being away from family, especially your children, can never be put into words. Add to that the possibility of one day seeing that your child, to whom all your hard work is dedicated, no longer recognizes your face.

I’m simply saying that these unsung heroes need more spotlight. We call OFWs modern heroes, but there are a lot of heroes in our country. If you intend every child in our country to aspire to be an OFW, we will experience not only an even worse brain drain but an actual exodus, where none are willing to come back given the chance. We need heroes who have attained a level of consciousness that goes past the rudimentary everyday worries of a salaried worker. We need icons of real patriotism rooted in hard work and a genuine love for their community. If only we could vote for public school teachers.

Motivation is very important when choosing icons that we hope the younger generations will emulate. There is absolutely nothing wrong with the motivations of OFWs such as me, but the title of “modern hero” deserves a higher standard.

MongoDB lessons, or how not to f* up hard on your first use

My current company just concluded its inaugural online competition for local school kids, and we learned quite a handful of lessons about making things scale for a simple online competition. I spearheaded the development with support from my superb teammates, and the biggest stumbling block of all was dealing with database gotchas.

I chose MongoDB because it seemed very lightweight and because of the ease of changing the models. I could live without joins; I found them very slow anyway. Data consistency can be enforced by keeping the interactions between your models simple, and since our project was projected to be a lightweight application (it is), I knew I could pull it off as long as I kept things small and simple.

Being an online competition for school smarts, the stack was relatively straightforward:

  • Tornado as the web server, which also serves static files
  • Nginx for load balancing. Now I think I should have let Nginx alone serve the static files, though.
  • Node.js game server using Socket.io for message passing between game clients as well as the multiplayer queue
  • MongoDB as the database

Then came the height of the competition, when the top spots were being cemented, and our mistakes and lack of familiarity became more apparent. I guess I was so used to cloud services that I had begun to neglect best practices in database setup. This was compounded by the fact that our awesome backend guy was also working with MongoDB for the first time. Suffice it to say, we all had our share of surprises.

Here is a short list of things to remember when trying out MongoDB the first time for production use, in no particular order:

  1. Use an asynchronous library. I used PyMongo behind the Tornado web server, and while Tornado was as fast as advertised and PyMongo is reliable, PyMongo calls block Tornado’s event loop, so I should have used an async driver like Motor (which builds on PyMongo). I think the reason for my choice was just me being intimidated by callback hell.
  2. Create your indexes as you develop. This makes lookups faster, and you can stay flexible about which fields to index as your queries take shape. This applies to other databases as well.
  3. Shard from day one. If you expect your application to scale then sharding should be part of the initial database setup. This also helps with #2, since indexes take up space. Sharded setups also usually run each shard as a replica set, so a secondary can take over when your primary goes down.
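To make the first point concrete, here is a minimal sketch that uses only Python's standard asyncio module, not Tornado, PyMongo or Motor themselves. The two `fake_query_*` functions are hypothetical stand-ins for a database call: one blocks the event loop the way a synchronous driver would, the other yields control while it waits.

```python
import asyncio
import time


async def fake_query_blocking(name, log):
    # Hypothetical stand-in for a blocking driver call:
    # time.sleep holds the whole event loop until it returns.
    log.append(name + " start")
    time.sleep(0.05)
    log.append(name + " end")


async def fake_query_async(name, log):
    # Hypothetical stand-in for an async driver call:
    # awaiting lets other request handlers run in the meantime.
    log.append(name + " start")
    await asyncio.sleep(0.05)
    log.append(name + " end")


async def handle_two_requests(query):
    # Serve two "requests" on one event loop and record the order of events.
    log = []
    await asyncio.gather(query("A", log), query("B", log))
    return log


blocking_order = asyncio.run(handle_two_requests(fake_query_blocking))
async_order = asyncio.run(handle_two_requests(fake_query_async))
print(blocking_order)  # request B cannot even start until A is finished
print(async_order)     # request B starts while A is still waiting
```

With the blocking version every request waits behind the one before it, which is roughly what a busy single-process server feels like when each handler makes a synchronous database call.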

Now that’s a really short list, but if I were to do it again, these would be at the top of my to-dos. Unfortunately, MongoDB fell out of the team’s favor and we may not be using it for some time. Still, the lessons learned can be applied in essence should we use another NoSQL solution.

Yet Another Steve Jobs Bio Post

I have finished reading Steve Jobs’s biography by Walter Isaacson and have been spotting reactions from bloggers of different backgrounds almost every week. There doesn’t seem to be a consensus on what really made Steve Jobs successful, and the varying opinions always either extrapolate or deviate from the supposed Reality Distortion Field. I could also see many so-called product people showing their worst behavior towards their peers, to varying degrees, in an attempt to emulate his product development success. This trend seems to have stemmed not from his death but rather from the biography, the only approved account of his life, both personal and professional.

I must admit that I had the temptation of letting go of my inhibitions and simply just say what I have in mind, regardless of how it will affect other people, be it at work or at home. It came from the hope that by being brutally honest and vocal I can replicate such a success. Add to that a binary view of the world where something is either shit or is the shit, and what we have is a persona that pushes away people most of the time.

I also had the pleasure of watching the House MD series right from the first season, as well as the occasional re-run of the Iron Man movies. Here the parallels are striking: some guy with amazing talent or expertise in a very narrow subject matter is allowed to have his way and be an asshole provided he delivers. The behavior exemplified is plainly childish and egotistical; the hubris and the drama make for a great story, just like what I’ve read in the biography. A technologist or expert who struggles with his inner demons and puts up a mask of arrogance, all the while saving the day and making the right moves when it counts. It’s one thing to have those kinds of personalities in fiction; it’s another thing to have living proof, albeit an outlier.

Back to Steve. What really struck me in many of the stories about him, both in the book and from other accounts, is his seemingly overwhelming awareness of himself. I know he couldn’t have been totally honest with others, as is true of most people, but he seemed to know exactly what he wanted, which somehow pushed the right buttons in people to agree with his agenda. This may sound like a rewording of the Reality Distortion Field, but I really think there is nothing to distort; it is a given that we humans exercise this great power of shaping our environments and our fortunes by sheer will, and to do so may imply a herculean effort but isn’t necessarily supernatural or even mystical. He was constantly shaping and building, which came from his passion for marrying art with technology. This existentialist view holds that nothing in life is set in stone, and that everything can be twisted and sculpted; it’s always just a matter of time.

This kind of mindset somehow echoes the Zen attitude towards life and reality, at least from what little I understand of it. I know Steve practiced a plethora of Eastern disciplines, especially as a young adult, Zen Buddhism included, and being part of the hippie movement made him prioritize his spiritual walk over many other practicalities in life. His spiritual search, I think, led him to the realization that he is virtually alone in life, and that really deep down he is nothing special; this allowed him to work things out from scratch and direct his time and energy to the things that really mattered to him. Other people being swayed by him is simply a by-product of having a clear vision of where he needed to go as a person. The products he helped build are a testament to the clarity of purpose and the design considerations that run through the mind of somebody who has seen the future; not in a mystical sense of the word, but simply wanting something to happen and believing it, because everything else in life is just some other people’s wish coming true.

By being committed to his vision he didn’t have to rationalize so many things, especially when running Apple or overseeing the development of a certain product. There was a mental picture, and all he had to do was execute. This obviously is nothing new; we have thousands of stories like this, not to mention that it happens on a daily basis. The success of Apple as a company has to have been seeded by this, although many other factors contributed to it, mostly through other people, which was out of Steve’s control. I bet anybody who has this clear vision of the future and the drive to see it to fruition can achieve success in his own endeavor. It may not be on the scale of Apple, but if we are to care about dichotomies, winning is winning: achieving one’s goals is just as productive as completing a self-imposed task. The kind of stamina and faith required to achieve such things has to be inherent in most of us.

His was a testament to the ability of our species to be the master of its own destiny. Steve is neither the first nor the last in the long line of truly great stories, most of which we never got to hear. I’m not trying to downplay his achievements, but I’m also not trying to put them on a totally different pedestal. It is just up to the individual to find out what he really wants in life, which is always the first step.

My local node.js mailing list

I forgot to spread this around so I might as well post it here. I have created a mailing list for local node.js users/hackers/enthusiasts:

http://groups.google.com/group/node-js-philippines/

Hope to see you there ;-)

Plans for 2012

Admittedly, I didn’t go through the intensive introspection routine I usually do when the year turns. Instead, I saw the end of 2011 as having left a lot of unfinished business and new insights that I feel I need to see come to fruition. That being said, here’s what I plan to do (and not do) for the rest of the year.

This is not a new year’s resolution list; this is more of a concrete plan on what I need to focus on, at least on the first part of the year.

Blog more

Most of my blog posts have been on the technical side, and I realized that since this blog is also my site, carrying my name as the domain, I might as well open up about the other facets of my life. Granted, it isn’t much, but there is still stuff that can be shared and moments to capture. In short, more frequent posts with more diverse topics.

Learn a new programming stack.

I surveyed what could be both challenging and beneficial to me as a freelancer, and I settled on iOS development using Objective-C. I passed on the other choices (server-side JavaScript, RoR, Haskell, etc.), which I’ll still be touching from time to time.

The initial readings proved quite a challenge coming from Python. I think I was babysat by such an expressive language that going back to my C roots brought on some sort of culture shock; or maybe it’s just that Python grew on me and now I think more like a Pythonista from using it every day.

This will be quite a ride, and the end game would be to publish an app in the App Store that can supplement my income.

Complete the Stanford online class I signed up for

I was a bit disappointed with how my Machine Learning stint came about - I didn’t find the whole process of working with data very engaging. I could change my mind in the future but somehow I don’t have the statistician blood in me.

So when Stanford offered a slew of online classes for 2012 I signed up for all of them, with the intent of downloading the video lectures for future reference and focusing on just one class. The class I’ve chosen is Human-Computer Interaction, which focuses heavily on interface design and development. I would leverage what I learn here when writing iOS apps so I can relate more to users and do a better job than I did with CVStash.

I could also use another class as a second focus, but I have yet to see that with my current schedule.

Prepare for my wedding.

I’m tying the knot with my long-time girlfriend, and I need to prepare a bunch of stuff (legal, financial, ceremonial, etc.) to make sure we have a memorable moment come summer time. I have most things laid out because I was initially planning on a short civil wedding last December, but my significant other wanted more formality, so we met halfway and went for a ceremony with quite a handful of guests come summer 2012.

I included this here because such commitments neither start nor end on the wedding night. I wish to make my transition to married life a smooth one, especially in terms of schedule. Relationships and family take time and effort, and what I do to make them worthwhile must fit in with my other goals so happiness can be complete all throughout.

Get back into shape

I almost completed a vertical jump program during our village’s version of a league “lockout”, which was mainly due to flooded streets. I have yet to notice any change in my jump, but during the training I shed some pounds and felt good throughout the day because of the intense workout. I plan on restarting the regimen, and I could probably supplement it with simple yoga exercises for more flexibility.

I also wish to go running around the village and along the C6 highway, but that may not be too feasible. I have yet to work out a schedule to accomplish this.

Brush up on my maths

This will be an ongoing struggle. I carried quite a stigma about mathematics for most of my youth, but over the past year I actively worked on bringing even the simplest of concepts back into the limelight. I owe a lot to Khan Academy, and I plan to do a weekly review and lesson using the site. I plan to rework my Calculus by summer and start to be comfortable with Linear Algebra by the end of the year.

Conclusion

The year 2011 has been fruitful and enlightening for me, having given me ways to see where I’m at on all fronts, be it personal, professional, spiritual or financial. I plan to take it up a notch this year with more focus and less clutter. It’s a good start, and when we’re done we should all feel good about ourselves, having mastered our time in the way we deem most useful.

Getting Sparrow scaffold to work on Xcode 4.2 in Mac OS X Lion

I’ve been working on getting myself familiar with iOS development, and I went on to learn how to make simple games. The open source project Sparrow looked to be a good start, but I encountered some problems getting it to run on the first try. Here’s what I did to get around them:

  1. Make sure you have git installed.
  2. Clone the project’s master branch, which should contain the latest stable release: git clone git://github.com/PrimaryFeather/Sparrow-Framework.git
  3. Run the demo in Sparrow-Framework/samples/demo/src/Demo.xcodeproj
  4. If you run into workspace errors, close Xcode as well as the iOS Simulator and reopen the Demo project.
  5. Once you can play around with the demo, set up the Sparrow source tree in Xcode. Go to Xcode Preferences > Locations > Source Trees tab. Add the entry SPARROW_SRC pointing to where you have the sparrow src/ folder (mine is /Users/arbiesamong/ios/sparrow/Sparrow-Framework/sparrow/src). Make sure there are no spaces in the path. You know it’s the correct folder if you can see the file Sparrow.xcodeproj
  6. Now we must verify that the scaffold runs. Close all Xcode and iOS Simulator windows.
  7. Copy Sparrow-Framework/samples/scaffold/ to a directory of your choosing.
  8. Open scaffold/src/AppScaffold.xcodeproj 
  9. In the Project Navigator pane, click on the AppScaffold project.
  10. Verify that the Sparrow subproject, Sparrow.xcodeproj, is in blue. This means that the Sparrow source can be found. If not, try to set the Location in the Utilities pane on the right to “Relative to SPARROW_SRC” (the source tree name set in step 5).
  11. Try to run the project. You should see a red square over a black background when the simulator loads the code.

That’s it! Hopefully you get to create the next Angry Birds using this scaffold project. :P

Sets in Python

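Python defines a set as an unordered collection of unique elements. The function set() creates such sets, and one can create one from a list by doing the following (the outputs shown are from Python 2; Python 3 prints the same set as {1, 2, 3})

mylist = [1,2,3]
myset = set(mylist)
myset # set([1, 2, 3])
type(myset) # <type 'set'>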

If there are duplicate elements in mylist, the duplicates are discarded, leaving only one instance of each element. Note that a set is unordered: small integers may happen to print in ascending order, but that is not guaranteed, so use sorted() when you need a particular order.
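As a quick check of the duplicate handling, here is a small sketch; the list contents are made up for illustration.

```python
mylist = [3, 1, 2, 3, 1]
myset = set(mylist)

# Each value is kept once, however often it appeared in the list.
print(len(mylist), len(myset))

# Sets make no promise about iteration order, so sort explicitly
# whenever the order of the unique values matters.
unique_sorted = sorted(myset)
print(unique_sorted)
```

Going through sorted() returns a plain list, which is usually what you want when de-duplicating values for display.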

We also know that strings in Python can be treated like a list of characters, although the two are not equal in terms of type. This similarity is highlighted when we compare a set derived from a string versus a set derived from a list of characters:

'abc' == ['a','b','c'] # False
set('abc') == set(['a','b','c']) # True

Traditional set operations are also available:

set1 = set(['m','a','k','a','t','i'])
set2 = set(['m','a','n','i','l','a'])
set1 - set2 # set(['k', 't']) aka difference
set1 | set2 # set(['a', 'i', 'k', 'm', 'l', 'n', 't']) aka union
set1 & set2 # set(['a', 'i', 'm']) aka intersection
set1 ^ set2 # set(['k', 'l', 'n', 't']) aka symmetric difference (xor)
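Each operator also has a named method on the set type, which some people find easier to read. A short sketch using the same two words:

```python
set1 = set('makati')
set2 = set('manila')

difference = set1.difference(set2)            # same as set1 - set2
union = set1.union(set2)                      # same as set1 | set2
intersection = set1.intersection(set2)        # same as set1 & set2
symmetric = set1.symmetric_difference(set2)   # same as set1 ^ set2

# The symmetric difference is everything in the union that is not shared.
print(symmetric == union - intersection)
```

Unlike the operators, the methods also accept any iterable as their argument, so set1.union('manila') works without wrapping the string in set() first.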

Membership in a set can be queried with the in operator, pretty much like how one does with lists and strings:

22 in set([1,11,22,33]) # True
'sam' in 'arbie samong' # True
'bes' in 'arbie samong' # False

Lastly, we can make comparison operations between sets:

set('cake') <= set('cherry cake') # True
set('cake').issubset(set('cherry cake')) # True
set('cake') > set('pie') # False

Emacs cheat sheet for beginners (from the built-in tutorial)

Disclaimer: I’ve been a Vim guy for a year, and a gedit guy for years. I don’t blog about them because I use them every day, and everyday stuff is just part of me, so I don’t feel it needs extensive writing at the moment.

Somehow things changed when I got my MacBook Air, and the Escape key is no longer pleasant to work with. So I’m now in the process of trying another editor for daily use, and none of the IDEs available really give me a kick in terms of utilizing finger speed. So I’m writing down a list of the most basic Emacs commands to memorize in case I forget.

First off, I found Emacs already bundled with the MBA, and accessible in the terminal via the emacs command. Also, you’d want to make full use of the alt key, so go to Terminal > Preferences > Keyboard and check “Use option as Meta key”. These commands were given in the built-in tutorial inside Emacs, accessible via Ctrl + h, t (push the Ctrl key, hold it, hit h, release, then hit t). Now for the commands:

Screen navigation

Ctrl + v View next screen

Alt + v View previous screen

Ctrl + l Clear screen and center to cursor

Alt + < Move to beginning of file

Alt + > Move to end of file

Cursor navigation

Ctrl + n Next line

Ctrl + p Previous line

Ctrl + f Move cursor forward one character

Ctrl + b Move cursor backward one character

More cursor navigation

Alt + f Move cursor forward one word

Alt + b Move cursor backward one word

Ctrl + a Move to beginning of line

Ctrl + e Move to end of line

Alt + a Move back to beginning of the sentence

Alt + e Move forward to end of the sentence

Basic utility commands

Ctrl + u <n> <item> Execute item n times. Item can be command or character.

Ctrl + g Cancel current command

Ctrl + z Suspend Emacs; actually a UNIX command. Type fg in terminal to resume.

Ctrl + x, Ctrl + c Exit Emacs


Deleting

Backspace (or delete in MBA keyboard) Delete previous character

Ctrl + d Delete next character

Alt + backspace Kill (aka Cut) previous word

Alt + d Kill next word

Ctrl + k Kill to end of the line

Alt + k Kill to end of sentence

Copy + paste

Ctrl + Space Start marking for kill (Mark set)

Ctrl + w Kill from Mark set to current position

Ctrl + y Yank aka Paste

Alt + y Yank with previous kill (hit repeatedly to cycle through previous other kills)

Undo

Ctrl + x, u Undo

Ctrl + _ Undo

Files

Ctrl + x, Ctrl + f <filename> Find (open existing or as new) a file

Ctrl + x, Ctrl + s Save

Buffers

Ctrl + x, Ctrl + b List all buffers

Ctrl + x, b <buffer> Switch to buffer

Ctrl + x, 1 Close other windows (e.g. to get rid of the buffer list)

Ctrl + x, s Save some buffers (has confirmation)

Windows

Ctrl + u, 0, Ctrl + l Redraw the screen with the current line at the top of the window

Ctrl + x, 2 Split the screen into two windows, both showing the same buffer

Ctrl + Alt + v Scroll down bottom window page

Ctrl + x, o Switch windows

Ctrl + x, 4, Ctrl + f Open file in new window (has prompt for filename)

Ctrl + x, 1 Kill other windows

Recursive Editing

Alt + x Get into a mini buffer

Esc, Esc, Esc All-purpose “get out”: leaves the mini buffer or recursive editing level and closes extra windows

Editing modes

Alt + x, <mode> Change to this mode

Ctrl + h, m Show help on current major mode

Ctrl + u, <n>, Ctrl + x, f Set margin to n characters

Alt + q Refill the paragraph under the Auto Fill minor mode

Search and replace

Ctrl + s Search (has prompt)

Alt + x, replace-string Replace string (has prompt)

Getting help

Ctrl + h, c, <key> Show short help on the command bound to a key

Ctrl + h, k, <key> Show long help on the command bound to a key

Ctrl + h, f, <function> Show help on a function

Ctrl + h, a <string> Show all commands that match string

Ctrl + h, i Read online manual

That’s it. I’ve tried to list the commands in sequential order as they appear in the tutorial but also grouped what I think are related commands. Hope this helps us be familiar with basic Emacs usage as we try to use it in our everyday editing. This is obviously best used as reference after completing the short tutorial.

Gradient Descent for Linear Regression

We’ll continue our exploration of linear regression, and now we’ll discuss the gradient descent algorithm, which will help us find the minimum of our cost function. Most of what’s discussed here, including the graphical media, is courtesy of Stanford University’s online Machine Learning class: http://ml-class.org. It’s free to join, so I suggest you go there and sign up if you wish to learn Machine Learning from the experts. My posts will just serve as personal notes plus my interpretations, though in some ways this can be seen as a tutorial of sorts. The original material, and a better lecturer in the person of Prof. Andrew Ng, can be found on the machine learning class site.

Gradient descent is a general algorithm that helps us arrive at some minimum value of a function. It can be applied to our cost function to find the best hypothesis that will fit our training set. Note that gradient descent is not only used with cost functions but in other applications as well; basically any that need to find a minimum.

In our case, we have our cost function J(Ɵ0, Ɵ1) and we basically want to find its minimum. Since gradient descent is an algorithm (not a formula), we’ll do well to describe how it goes:

  1. Start with some (Ɵ0, Ɵ1). This will be our arbitrary starting point to kick off the algorithm. Usually we can start with Ɵ0=0 and Ɵ1=0, but this could actually be any starting point.
  2. Keep changing the values (Ɵ0, Ɵ1) to reduce our cost function J(Ɵ0, Ɵ1), until we end up with a minimum value.

Simple, isn’t it? :D
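As a rough preview, the two steps can be sketched in a few lines of Python. This is only a sketch under assumptions: the training set is made up, the cost is taken to be the squared-error cost J = 1/(2m) * sum((h(x) - y)^2), and the names `xs`, `ys`, `cost` and `gradient_descent` as well as the default `alpha` and `iterations` are my own choices. The update inside the loop is the rule that gets taken apart later in this post.

```python
# Tiny made-up training set that lies exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]


def cost(theta0, theta1):
    # Assumed squared-error cost: J = 1/(2m) * sum((h(x) - y)^2)
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(alpha=0.1, iterations=5000):
    theta0, theta1 = 0.0, 0.0  # step 1: arbitrary starting point
    m = len(xs)
    for _ in range(iterations):  # step 2: keep nudging the parameters
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                              # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m   # dJ/dtheta1
        # Both new values are computed from the old ones before assigning.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


theta0, theta1 = gradient_descent()
print(theta0, theta1, cost(theta0, theta1))
```

A fixed number of iterations stands in for "until convergence" here; a stricter version would stop once the cost no longer changes by more than some small tolerance.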

Now, let’s visualize for a moment our cost function graph which will now come in 3d since we will not be assuming Ɵ0=0. This means we have 3 variables, the cost function J(Ɵ0, Ɵ1), and then Ɵ0 and Ɵ1.

You’ll notice that this looks more varied than the bowl shape we had the last time. This is just a helpful visualization; for linear regression problems the graph will keep its bowl-shaped figure.

The nice thing about this graph, though, is it looks like a typical hillside scene. I’m not trying to just get your imagination running here, as this metaphor is actually *the* basis for gradient descent.

Remember, our goal here is to arrive at a minimum point for our function. Using the 3d graph, we can imagine the low points to be those that are in blue and are obviously situated lower on the vertical axis. Using gradient descent, we then pick some arbitrary starting point (the violet dot):

Now, imagine that you are that starting point. You’ll see that you are at some place on the hill, and you basically want to go down to the lowest point. So you take a 360-degree look around and think about which direction to step in. You take a baby step down, and then check again for the next step. You do this until you get to the lowest point, aka the algorithm converging.

We’ll call the lowest point you have arrived at the local minimum. This is in contrast with the global minimum, which is the lowest point of the whole graph. The local minimum is just the lowest point nearest to your starting point as found by the gradient descent algorithm. We need to make this distinction because if we pick a different starting point (even one near the original starting point), we may end up with a different lowest point, aka local minimum:

Here we see the green dot starting off just a bit to the right of the original purple starting point, and it finds a different lowest point that is quite far from the original. This shows that the local minimum found depends heavily on the given starting point, and this is a property of gradient descent. Another thing about the local minimum is that, if a value like Ɵ1 has already reached it, the slope there is zero, so the gradient descent algorithm will leave Ɵ1 unchanged on its next iteration.

Now we can express this idea mathematically. We describe the gradient descent algorithm to be:
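Written in LaTeX notation, and assuming the two-parameter cost function J(Ɵ0, Ɵ1) used so far, the update rule can be sketched as:

```latex
\text{repeat until convergence: } \{ \quad
\theta_j := \theta_j - \alpha \, \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)
\quad \text{(for } j = 0 \text{ and } j = 1 \text{, updated simultaneously)} \quad \}
```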

Let’s take this apart:

  1. We need to repeat the operation described between the curly brackets until convergence; in other words, it is a loop.
  2. Ɵj holds the current value of the parameter being updated.
  3. We are using :=, the assignment operator, in contrast with the = sign, which is a truth assertion. The symbol := says that we assign the value on the right to the variable on the left.
  4. The α character is called the learning rate, or the alpha value of the algorithm. It describes how big a step to take per iteration. If alpha is too small, the descent will be too slow. If it is too big, each step can overshoot the minimum. This latter case could cause the algorithm to fail to converge, or even diverge.
  5. The rest of the right-hand side of the assignment is a derivative. I’ll explain this more in a while.
  6. The statement in parentheses on the right is a subtlety in gradient descent. It says that we need to update Ɵ0 and Ɵ1 simultaneously. To illustrate, here is a comparison of the correct way of updating Ɵ0 and Ɵ1 with the wrong way:

All this is saying is that the new values of Ɵ0 and Ɵ1 must both be computed from their values at the start of the iteration. The parameters only get overwritten once both new values are ready. The ‘incorrect’ way described here is a different algorithm altogether, which behaves very differently from gradient descent and is not covered here. To give a quick illustration with a made-up update rule, if Ɵ0 = 1 and Ɵ1 = 2 and we apply a simultaneous update for just one step/iteration, we get:

  • Ɵ0 := Ɵ0 + sqrt(Ɵ0*Ɵ1)
  • Ɵ0 := 1 + sqrt(1*2)
  • Ɵ0 := 1 + sqrt(2)

and

  • Ɵ1 := Ɵ1 + sqrt(Ɵ0*Ɵ1)
  • Ɵ1 := 2 + sqrt(1*2)
  • Ɵ1 := 2 + sqrt(2)

which shows how we used the old values of Ɵ0 and Ɵ1 simultaneously: neither value is overwritten until both updates have been computed.
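The same toy update can be sketched in Python to contrast the two orderings. The update rule Ɵ + sqrt(Ɵ0*Ɵ1) is just the illustrative one from the bullets above, not the real gradient descent formula, and the variable names are my own.

```python
import math

theta0, theta1 = 1.0, 2.0

# Correct (simultaneous): both new values are computed from the OLD
# theta0 and theta1, and only then stored.
new0 = theta0 + math.sqrt(theta0 * theta1)
new1 = theta1 + math.sqrt(theta0 * theta1)
simultaneous = (new0, new1)

# Incorrect (sequential): theta0 is overwritten first, so the update
# for theta1 accidentally reads the NEW theta0.
seq0 = theta0 + math.sqrt(theta0 * theta1)
seq1 = theta1 + math.sqrt(seq0 * theta1)
sequential = (seq0, seq1)

print("simultaneous:", simultaneous)
print("sequential:  ", sequential)
```

Both orderings agree on Ɵ0, but the sequential version produces a different Ɵ1 because it mixes values from two different iterations.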

Now let’s discuss a bit about the derivative part of the algorithm:

The simplest explanation is that it describes the slope of the line that touches the cost function at the point for the current value of the parameter. To illustrate this graphically, let’s assume Ɵ0 = 0, so that we only have Ɵ1. Suppose our cost function graph looks like the usual parabola, and we picked a certain arbitrary value for Ɵ1:

Since gradient descent will update the value of Ɵ1, the derivative part of the algorithm is the slope of the line that touches the curve at the current value of Ɵ1, like this:

We can also see this slope as telling Ɵ1 which way it must “travel” in order to reach the minimum. Since our point is on the right side of the parabola, the slope is positive and our derivative is a positive number; subtracting α times a positive number makes Ɵ1 smaller, moving it left toward the minimum. If the point is on the left side, the derivative is a negative number, as it is a negative slope, and the update moves Ɵ1 to the right.

Another property of this derivative is that if we are at the local minimum, the value of the derivative is zero, since the slope will be a flat horizontal line.
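A small sketch can confirm these three cases. It assumes a hypothetical parabola J(Ɵ1) = Ɵ1² with its minimum at 0 and an arbitrary learning rate of 0.1; neither comes from the housing example.

```python
def derivative(theta1):
    # Slope of the hypothetical cost J(theta1) = theta1**2.
    return 2 * theta1


def step(theta1, alpha=0.1):
    """One gradient descent update for a single parameter."""
    return theta1 - alpha * derivative(theta1)


for theta1 in (3.0, -3.0, 0.0):
    # Right of the minimum: positive slope, theta1 decreases.
    # Left of the minimum: negative slope, theta1 increases.
    # At the minimum: zero slope, theta1 stays put.
    print(theta1, derivative(theta1), step(theta1))
```

In every case the update moves Ɵ1 toward the minimum, or leaves it alone if it is already there.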

The derivative is important because it describes the behavior of the algorithm as it reaches the minimum. As we approach the local minimum, the slope gets flatter, so gradient descent automatically takes smaller steps. This makes it unnecessary to decrease the learning rate over time, as this is taken care of by the derivative. In other words, the learning rate stays fixed and the derivative controls how big each step actually is. This, in effect, makes the algorithm take smaller steps over time:
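To see the shrinking steps numerically, here is a minimal sketch that again assumes a hypothetical cost J(Ɵ) = Ɵ². The starting value of 10 and both learning rates (0.1 and 1.1) are arbitrary choices for illustration.

```python
def derivative(theta):
    # Slope of the hypothetical cost J(theta) = theta**2.
    return 2 * theta


def run(theta, alpha, iterations):
    """Return a list of (new theta, step size) pairs using a fixed alpha."""
    history = []
    for _ in range(iterations):
        step = alpha * derivative(theta)
        theta = theta - step
        history.append((theta, abs(step)))
    return history


# Fixed, reasonably small learning rate: every step is smaller than the last.
small = run(10.0, alpha=0.1, iterations=5)

# Learning rate that is too large: each step overshoots and theta moves
# further away from the minimum at 0.
large = run(10.0, alpha=1.1, iterations=5)

for theta, step in small:
    print(f"alpha=0.1  theta={theta:.4f}  step={step:.4f}")
for theta, step in large:
    print(f"alpha=1.1  theta={theta:.4f}  step={step:.4f}")
```

With alpha fixed at 0.1 the step size shrinks on its own as the slope flattens, while with alpha at 1.1 the value bounces from side to side and grows.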

And basically it’s this derivative term that enables the gradient descent algorithm to actually minimize the value of our cost function. This also means that the algorithm can converge even if the learning rate is kept fixed (though the learning rate must not be too large). Here is our algorithm beside the linear regression model we discussed before:

So, in order to carry out the minimization using gradient descent, we need to work out the derivative term for our cost function:

Here we simply substituted the definition of our cost function for J(Ɵ0, Ɵ1) inside the partial derivative term and also spelled out our hypothesis function. Applying this to a single step of our gradient descent algorithm, with Ɵ0 and Ɵ1 as the parameters of our hypothesis function, we get:

If we put those two together we get our modified gradient descent algorithm which now actually reflects what we are trying to do with our cost function:
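Written out in LaTeX notation, the combined update can be sketched as follows. This assumes the usual squared-error cost with a 1/(2m) factor over m training samples and the hypothesis h(x) = Ɵ0 + Ɵ1·x; if the cost function was defined with a different constant earlier, the factor in front of the sums changes accordingly.

```latex
\begin{aligned}
&\text{repeat until convergence: } \{ \\
&\qquad \theta_0 := \theta_0 - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) \\
&\qquad \theta_1 := \theta_1 - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x^{(i)} \\
&\} \qquad \text{(update } \theta_0 \text{ and } \theta_1 \text{ simultaneously)}
\end{aligned}
```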

So let’s see the algorithm in action. Here we have the familiar graph of our hypothesis function against the training data on the left, and on the right our cost function represented as a contour figure (since it is now in 3D: J(Ɵ0, Ɵ1), Ɵ0 and Ɵ1).

Next, we pick some arbitrary value to start with. It is usually (0, 0) for (Ɵ0, Ɵ1), but for now let’s start off with (-900, -0.1).

Suppose we take one step to the left, which also changes our line for our hypothesis function:

Then another step. Notice how the line of our hypothesis function changes on the left graph:

We keep adjusting the point in our cost function graph on the right until we get to the local minimum (also the global minimum here, since the contour graph translates into a 3D bowl-shaped figure called a convex function, which is just the 3D equivalent of a parabola). This is important: the cost function of a linear regression problem has no local minima other than its single global minimum. As we get to the minimum we see our hypothesis function fit our samples more nicely:

Now we can make our prediction. If we have a house of around 200 square feet, we can estimate a selling price of around $150k to $170k.

It is worth noting that there are other versions of gradient descent that behave differently from the one we described here. This version of gradient descent is technically called Batch Gradient Descent. The word batch alludes to the fact that we take all the training samples into consideration in each calculation, as seen in the formula for our cost function where we take a summation. We will talk about those other versions as we progress through our study.
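As a rough sketch of what batch gradient descent for one-variable linear regression might look like in code, here is a small Python version. The data points, learning rate, and iteration count are invented for illustration (the points lie exactly on y = 1 + 2x), and the cost is assumed to be the squared-error cost with a 1/(2m) factor.

```python
def batch_gradient_descent(xs, ys, alpha=0.05, iterations=5000):
    """Fit h(x) = theta0 + theta1 * x using every sample on each step."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0  # the usual (0, 0) starting point
    for _ in range(iterations):
        # Prediction error for each training sample, using the current thetas.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # The "batch" part: each derivative sums over ALL training samples.
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both right-hand sides use the old thetas.
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Hypothetical training data lying exactly on the line y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

theta0, theta1 = batch_gradient_descent(xs, ys)
print(theta0, theta1)
```

Because the made-up data sits exactly on a line, the fitted parameters should land very close to the intercept 1 and the slope 2 that generated it. Real data with noise would settle on the best-fitting line instead.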

So that wraps up our discussion about gradient descent. Here we have learned our very first machine learning algorithm in its entirety. Congratulations, and hopefully you’ll stick around for more machine learning goodness.