Book Review: “Think Bayes” by Allen Downey

I vaguely remember learning about Bayes’ Theorem in college. Until I read this book, I never thought there were so many applications of Bayes’ Theorem in our daily life. Drug companies use it to get FDA approval for drugs. Food companies apply the theorem to “prove” certain foods are good for us. In a way, the theorem allows us to change our perception of how likely certain events are to occur. I can imagine more applications than the ones he outlines in the book. By the way, the PDF version of this book can be freely downloaded here, thanks to the author’s generosity.

Because of this book and his YouTube videos, I got interested in re-learning Python as a practical, useful programming language. It opened my eyes to the wonderful world of Python and its ecosystem.

The book gets pretty technical fast after the first chapter, which piqued my interest with its Cookie, M&M, and Monty Hall problems. In the end I don’t know if it’s really necessary to know all the various applications in prediction and simulation, but they could come in handy for people who work in statistics.

The key thing to remember is that Bayes’ Theorem gives a way to update the prior probability of a hypothesis, H, into a posterior probability after learning some new piece of data, D. The equation is simple: P(H|D) = P(H) * P(D|H) / P(D)
As it turns out, the most difficult part is P(D), the normalizing constant, which is the probability of seeing the data under any hypothesis at all.
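
To make the formula concrete, here is a minimal Python sketch of a single Bayesian update. It uses the cookie problem as I remember it from the first chapter (Bowl 1 holds 30 vanilla and 10 chocolate cookies, Bowl 2 holds 20 of each; treat those numbers as my assumption). The function name `bayes_update` is mine, not the book’s, and this is not the author’s code.

```python
def bayes_update(priors, likelihoods):
    """Return the posterior P(H|D) for each hypothesis H.

    priors:      dict mapping hypothesis -> P(H)
    likelihoods: dict mapping hypothesis -> P(D|H)
    """
    # Unnormalized posterior: P(H) * P(D|H)
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    # P(D): total probability of the data over all hypotheses
    p_data = sum(unnormalized.values())
    return {h: value / p_data for h, value in unnormalized.items()}


# Assumed cookie-problem numbers: we drew a vanilla cookie.
priors = {"Bowl 1": 0.5, "Bowl 2": 0.5}
likelihoods = {"Bowl 1": 30 / 40, "Bowl 2": 20 / 40}
posterior = bayes_update(priors, likelihoods)
print(posterior)
```

With these numbers, P(D) is 0.5 * 0.75 + 0.5 * 0.5 = 0.625, so the posterior for Bowl 1 is 0.375 / 0.625 = 0.6. Notice that P(D) is just the sum of the numerators over all hypotheses, which is why it only gets hard when the hypotheses are many or continuous.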

Overall, all the concepts presented by the author make sense to me. I’m not sure I can actually devise and implement the models when an actual problem arrives. But at least I can recognize a Bayesian problem when I see one and know where to find help.

Chapter Summary:
1. Bayes’s Theorem: The theorem is derived easily and discussed. He introduces a calculator for the probability of getting a heart attack.

2. Computational Statistics: The author provides a set of Python code/tools to calculate results quickly. I had to take a Python refresher course before getting any further. But the tools are powerful and helped me understand the subject better. All the code is downloadable from the thinkbayes.com website. It took me a while to figure out the Monty Hall problem – one of those paradoxes that’s hard to let sink in.

3. Estimation: This chapter estimates the posterior probability after rolling one of several multi-sided dice (4-/6-/8-/12-/20-sided) and after successive rolls (more data points change the probability distribution), and the probability of the number of locomotives given the observation of one locomotive number. This is when you really need to have the computer do the work for you. The intuitive part is that the more information you have, the better your estimate is going to be, as some hypotheses are ruled out. For example, rolling a 6 eliminates the 4-sided die.

4. More Estimation: In this chapter, the author covers the “biased” Euro coin problem. The interesting phenomenon is that the choice of prior makes little difference (provided you don’t rule things out with 0 probability): the posterior distributions tend to converge as more data points come in.

5. Odds and Addends: The author covers the “odds” form of Bayes’ Theorem, o(A|D) = o(A) * P(D|A) / P(D|B), which is probably easier to understand for most people. The Addends part of the chapter goes into getting the density and distribution functions.

6. Decision Analysis: Given a prior distribution, what’s the best decision to make given some data? The author presents a Price Is Right scenario and solves for the best price to bid in a Showcase Showdown, based on past data and a best guess to adjust the bid. The PDF (Probability Density Function) is introduced here. KDE (Kernel Density Estimation) is also used to produce a smooth PDF that fits the data. This is an interesting application of Bayes’ Theorem. The math is complicated and couldn’t be done without a computer. I guess it would be good to bring your computer to the Price Is Right game.

7. Prediction: Here the author presents a better way to predict the score of a playoff game based on past data between two teams (Boston Bruins vs. Vancouver Canucks). This should be no different from how the gambling industry computes the odds before a big game. Just imagine all the gambling earnings you could’ve made by mastering this chapter! I’m sure someone has applied the same idea to financial markets like stocks, bonds and futures.

8. Observer Bias: This is another angle on predicting an outcome (like the wait time for the next train at a given time), this time taking observer bias into account.

9. Two Dimensions: Using the paintball example, the author applies the Bayesian framework to a two-dimensional problem. In addition, the joint distribution, marginal distribution, and conditional distribution are introduced. As if the one-dimensional Bayesian framework weren’t confusing enough…

10. Approximate Bayesian Computation (ABC): This is used when the likelihood of any particular dataset is 1) very small, 2) expensive to compute, or 3) not really what we want. We don’t care about the likelihood of seeing the exact dataset we saw, but of seeing any dataset like it.

11. Hypothesis Testing: The Bayes factor, the ratio of the likelihood of the data under one hypothesis to its likelihood under a baseline hypothesis, can be used to weigh the evidence for a particular hypothesis, e.g. whether a Euro coin is fair or biased. A Bayes factor of 1~3 is barely worth mentioning, 3~10 is substantial, 10~30 strong, 30~100 very strong, >100 decisive.

12. Evidence: Test for the strength of evidence. “How strong is the evidence that Alice is better than Bob, given their SAT scores?”

13. Simulation: Simulate the tumor growth rate based on a prior growth rate and data points such as current tumor size, age, etc.

14. A Hierarchical Model: The model reflects the structure of the system, with causes at the top and effects at the bottom. Instead of solving the “forward” problem, we can reverse engineer the distribution of the parameters given the data. The Geiger counter problem demonstrates the connection between causation and hierarchical modeling.

15. Dealing with Dimensions: This last chapter combines all of the lessons so far and applies them to the “Belly Button Bacteria” prediction and simulation. This is a very difficult chapter to understand and probably requires a full understanding of the previous lessons.
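
To illustrate the dice problem from chapter 3 above, here is a short Python sketch of my own (not the book’s tools). It assumes one die is picked at random from a box holding a 4-, 6-, 8-, 12- and 20-sided die, so the prior is uniform, and that a 6 is rolled. The function name `update` and the variable names are my own choices for illustration.

```python
from fractions import Fraction


def update(beliefs, roll):
    """Update beliefs {sides: probability} after observing one roll."""
    unnormalized = {}
    for sides, prob in beliefs.items():
        # A die with fewer sides than the roll cannot have produced it.
        likelihood = Fraction(1, sides) if roll <= sides else Fraction(0)
        unnormalized[sides] = prob * likelihood
    total = sum(unnormalized.values())
    return {sides: value / total for sides, value in unnormalized.items()}


dice = [4, 6, 8, 12, 20]
# Assumed uniform prior: each die is equally likely to be picked.
beliefs = {sides: Fraction(1, len(dice)) for sides in dice}
beliefs = update(beliefs, 6)
for sides, prob in beliefs.items():
    print(f"{sides:2d}-sided: {float(prob):.3f}")
```

Rolling a 6 drives the 4-sided die to probability 0 and makes the 6-sided die the most likely candidate (20/51, about 0.39). Calling `update` again with further rolls shows how each new data point reshapes the distribution.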


Book Review: “The Startup of You” by Reid Hoffman and Ben Casnocha

Consider yourself an entrepreneur and your career a startup company. By developing your competitive advantage, formulating a plan to adapt, networking with others, and seizing breakout opportunities by taking intelligent risks, you will become a successful startup yourself, according to the authors, who have had successful startup experiences themselves.

The authors give many good tips, like budgeting for networking and maintaining an identity separate from your job. At the end of the audiobook (bonus), the authors have a nice panel discussion between themselves that answered a few of my questions. When billionaires give you tips, you’d better listen. The only turn-off is that at times the author (Reid is the co-founder of LinkedIn) appears to be promoting LinkedIn. But overall, it’s a good read.

A quick summary:
1. All Humans are Entrepreneurs:
Why the startup of you? The world is changing, and the amount of time you spend at any job is shrinking. We need to act like an entrepreneur, making decisions in an information-poor, time-compressed, resource-constrained environment. Using the backdrop of Detroit and the auto industry, the authors drill into the point of keeping up the entrepreneurial spirit. The “Permanent Beta” mind-set means never finishing improving your own skill set, which is the theme of all the chapters that follow.

2. Develop a competitive advantage:
You are selling your brainpower, skills and energy, and doing so in the face of massive competition. Ask yourself, “A company hires me over other professionals because …” What are you offering that’s hard to come by and both rare and valuable? Three factors influence your competitive advantage: your assets, your aspirations and values, and market realities. More below:
Your Assets:
Soft: things you can’t directly trade for money, the intangible contributors to career success: your knowledge and information, professional connections and the trust you’ve built up, skills you have mastered, your reputation and professional brand, and your strengths (things that come easily to you).
Hard: what you would list on a balance sheet: cash in the bank, etc.
Your Aspirations and Values:
Aspirations: your deepest wishes, ideas, goals and vision of the future. Values: what’s important to you in life, be it knowledge, autonomy, money, integrity, power and so on.
The Market Reality:
Andreessen’s quote: “Markets that don’t exist don’t care how smart you are.” You aren’t entitled to anything. All advantages are local; pick a hill that has less competition.

3. Plan to Adapt:
Listen both to your heart and to the market. Be true to your values and vision, yet remain flexible enough to adapt. The authors introduce the idea of ABZ planning:
Plan A: what you’re doing now – your current implementation of your competitive advantage.
Plan B: what you pivot to when you need to change either your goal or the route for getting there – generally in the same ballpark as Plan A.
Plan Z: your fallback position: your lifeboat. It’s what allows you to take on Plans A and B.
Prioritize learning: prioritize plans that offer the best chance at learning about yourself and the world. Ask what will grow your soft assets the fastest and offer the most learning potential.
Learn by doing: actions, not plans, generate lessons that help you test your hypotheses against reality. I particularly like this idea of maintaining a reputation and public portfolio of work that’s not tied to your employer. Then you’ll have a professional identity that you can carry with you as you shift jobs. “The best Plan B is different but very much related to what you’re already doing.” Favor options that let you keep one foot planted while the other one swings to the new territory. Pivot into an adjacent niche.

4. It takes a Network:
“The fastest way to change yourself is to hang out with people who are already the way you want to be.” “I” vs. “We” is a false choice. It’s both. Think I^We (I to the power of “We”). Like dating, have a long-term perspective. See the world from the other person’s perspective. Also think first about how you can help and collaborate with the other person. Think of it like ballroom dancing – move in unison, perhaps gently guiding or following. Ask “What’s in it for us?” instead of “What’s in it for me?”
Two types of relationships:
1) Professional allies: someone you consult regularly for advice, proactively share and collaborate on opportunities with, and talk up/promote to other friends. It is a relationship that goes from being an exchange partnership to being a true alliance.
2) Weak ties and acquaintances: they expose you to information or job listings you haven’t seen – valuable for the breadth and reach of your network. Here the authors dive into the LinkedIn functions – a bit of self-promotion. Why not? It works.
How to strengthen the network: 1) offer to help. 2) give gifts like relevant information and articles, introductions, etc. 3) let yourself be helped. 4) be a bridge. 5) set up an “interesting people” fund to take people out for meals.
Navigate status dynamics when dealing with powerful people: don’t make them look bad. Actively maintain the relationships you value and consciously let fade those you do not.

5. Pursue Breakout Opportunities:
Opportunities are like the snap to the quarterback in football. Careers, like start-ups, are punctuated with breakouts. Curiosity about industries, people, and jobs will make you alert to professional opportunities. Join as many information groups and associations as possible. Do the hustle and be resourceful. Be resilient: when the naysayers are loud, turn up the music (Pandora). Keeping your options open is frequently more of a risk than committing to a plan of action. “Making a decision reduces opportunities in the short run, but increases opportunities in the long run.”

6. Take Intelligent Risks: “Risk” in a career context is the downside consequence of a given action or decision. Pursue opportunities where others misperceive the risk (like Warren Buffett does) – ones that are lower risk than your peers think, but are still high-reward. Short-term risk increases long-term stability. The volatility paradox: small fires prevent the big burn. Make yourself resilient to shocks by pursuing opportunities with some volatility baked in. Non-volatile environments give only an illusion of stability – with a high chance of a “black swan.” “The only long term answer to risk is resilience.”

7. Who You Know is What You Know:
“How you gather, manage, and use information will determine whether you win or lose.” “What will get you somewhere is being able to access the information you need, when you need it.” Your network is an indispensable source of intelligence. The people in it offer personalized, contextualized advice, and filter the information you get from other sources. Literacy has progressed from reading and writing in the old days to search literacy and now “network literacy” (knowing how to conceptualize, access, and benefit from the information flowing through your social network). Pull intelligence from your network by posing questions to the entire network or to targeted individuals who are domain experts, who know you well, or who are just really smart people.
Asking good questions: 1) converse, don’t interrogate, 2) adjust the lens (narrowing the scope), 3) frame and prime (e.g. what are the top 3 things you did NOT get to do and wish you had?), 4) follow up and probe, 5) push interesting information out to your network.

Corrupted Internet Explorer 11 Files – How I Fixed It

While fixing some Windows 7 system issues (like Internet Explorer or Quicken hanging when run), I ran the “sfc /scannow” command many times. Each time, it would complain that there were corrupted files that could not be fixed. (“Windows Resource Protection found corrupt files but was unable to fix some of them.”)

After looking into the log file residing in c:\windows\Logs\CBS\CBS.log, I noticed that the majority of the corrupted files were related to Internet Explorer 11. Then I discovered a little “Search Protect” icon showing up in the task bar. Upon further searching, I concluded that this was malware from “Conduit”. I suspected this was what had corrupted Internet Explorer 11.
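
CBS.log is large, and the entries written by sfc are the ones tagged “[SR]”, so filtering on that tag makes the corrupted files much easier to spot. Here is a minimal Python sketch of such a filter. The function name `sfc_entries` is my own, the “cannot repair” keyword is my assumption about the wording sfc uses, and the sample lines are made up for illustration rather than copied from my log.

```python
def sfc_entries(lines, keyword=None):
    """Yield CBS.log lines written by sfc (tagged "[SR]").

    If keyword is given, keep only entries containing it (case-insensitive).
    """
    for line in lines:
        if "[SR]" not in line:
            continue
        if keyword is None or keyword.lower() in line.lower():
            yield line.rstrip("\n")


# Hypothetical sample lines, made up for illustration only.
sample = [
    "2014-01-01 10:00:00, Info CSI 00000001 [SR] Verifying 100 components\n",
    "2014-01-01 10:00:01, Info CBS unrelated servicing message\n",
    "2014-01-01 10:00:02, Info CSI 00000002 [SR] Cannot repair member file example.dll\n",
]
for entry in sfc_entries(sample, keyword="cannot repair"):
    print(entry)
```

On a real system you would pass an open file instead of the sample list, for example `open(r"C:\Windows\Logs\CBS\CBS.log", errors="replace")`, probably from an elevated prompt.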

First, I needed to get rid of this malware. Based on a recommendation from my Google search, I downloaded JRT (Junkware Removal Tool) and proceeded to remove “Search Protect” from my system. Well, JRT didn’t quite remove it from the auto-start programs, so I had to remove it manually using Microsoft’s Autoruns. This was the only way to get rid of the annoying warning, shown every time I logged into Windows 7, that it couldn’t find “backgroundcontainer.dll” (already removed by JRT).

Since Internet Explorer 11 was the most up-to-date Internet Explorer, there was no newer update to overwrite it. I even tried downloading it directly from Microsoft, but the official download site still offered only an older version. So I decided to uninstall it, which was not a trivial task since Internet Explorer is integrated into Windows 7. Based on this recommendation, I needed to deselect Internet Explorer 11 in Windows Features (Start -> Control Panel -> Programs and Features -> Select “Turn Windows features on or off” on the left panel -> Deselect “Internet Explorer 11”). Then go into Windows Update and uninstall Internet Explorer 11 (Start -> Control Panel -> Windows Update -> Click on “Installed Updates” at the bottom of the left panel -> Enter “Internet Explorer” -> Right click on Windows Internet Explorer 11 and uninstall it). After the above steps and a reboot, the previous Internet Explorer (in my case IE 9) became the Internet Explorer app.

After running “sfc /scannow” a few more times and a few more reboots, I was able to run Internet Explorer 9 without any problem, and my Quicken app was finally able to run without crashing. Evidently, Quicken uses Microsoft Framework, which is tightly integrated with Internet Explorer.

Lessons learned:
1. Watch out for any strange icons on your task bars. Research their purposes. When in doubt, get rid of them so they don’t cause conflicts with other software.

2. Every so often (2 weeks), run “sfc /scannow” to check for any corrupted system files.

Migrating My Defective Ultrabook SSD to a New SSD – My Journey and Lessons Learned

For the last few months I had been seeing strange behavior on my Windows 7 Ultrabook – failing to boot on occasion, getting the same Windows updates over and over, certain applications not installing correctly, etc. I figured the original 240GB SSD might be reaching its end of life, though I had been using my laptop for just 2 years. After I downloaded and ran HD Tune Pro, my fear was confirmed. The SSD had more than 8% damaged blocks. I quickly purchased another SSD from Amazon of the “same” size (at least I thought), and then the real battle began…

First, the new Samsung SSD was advertised as having 250GB of storage when in fact it has only 232.88GB of true storage (lawyers’ ears should be perked up by now). My original Micron SSD was advertised by Acer as having 240GB when it has 238GB of true storage. So the new SSD is slightly smaller than the old one. This was bad news. It’s much easier to migrate from a small disk to a large disk. Lesson #1: always buy a bigger disk than the original. When in doubt, go for the next bigger size. To overcome this issue, I had to shrink the original disk: 1) Right click on Computer and select Manage, 2) Click on Storage -> Disk Management, 3) Select the partition to be shrunk, 4) Right click -> Shrink Volume. I managed to shrink it by just 8GB. This was good enough to scrape by and fit the new SSD size.
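
Much of the gap between advertised and “true” size is a matter of units: drive makers count a gigabyte as 10^9 bytes, while Windows reports sizes in units of 2^30 bytes. Here is a small Python sketch of the conversion. The function name is my own, and it assumes the advertised figure is an exact decimal byte count, which real drives only approximate.

```python
def decimal_gb_to_gib(gigabytes):
    """Convert advertised decimal gigabytes (10**9 bytes) to GiB (2**30 bytes)."""
    return gigabytes * 10**9 / 2**30


for advertised in (240, 250, 256):
    print(f"{advertised} GB advertised = {decimal_gb_to_gib(advertised):.2f} GiB")
```

By this arithmetic, 250 decimal GB comes to about 232.83, close to the 232.88GB I saw on the Samsung. The 238GB reading on my original drive is what roughly 256 decimal GB would show, so the old disk may have had more raw capacity than its label suggested; that last part is my guess from the numbers, not something I verified.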

Next, I had to clone the disk to the new drive. The new Samsung SSD came with its own migration tool. Unfortunately, the tool refused to work because my original disk had data corruption. It’s kind of silly to have a tool that won’t work in the very situation it’s called on to handle. Then I tried CloneZilla, a nice tool I had used before to clone and back up disk images. It’s good at duplicating a disk verbatim, as long as the destination disk is larger than the source disk. But it choked badly on this task because I needed to go from a slightly larger SSD to a smaller one. No go. I tried manually copying the partitions, but the Windows partitions were a nightmare to copy correctly. I almost gave up and was ready to return the SSD to Amazon for the next size up, a 500GB SSD, which cost another $100. Then I remembered Acronis True Image (I used the 2009 version), whose “Clone Disk” function was smart enough to skip the empty partition when copying. It eventually saved my butt and migrated everything correctly to the new disk. Another lesson learned: good disk migration software goes a long way toward solving a really challenging problem.

I’ll spare you the painful story of replacing the SSD on my Ultrabook (12 screws on the cover, 5 screws on the SSD housing). What a relief to see the new disk boot up nicely and perform well. A journey indeed. One last lesson: the SSD may be a technological wonder in terms of performance, but it’s still some way off from having the same reliability as a hard disk.

Book Review: “The Botany of Desire” by Michael Pollan

All this time I thought I was the center of the universe for the plants in my garden, when in fact I was being manipulated by them to propagate their genes, no different from the bees and other insects that pollinate them. In this book Michael Pollan tells the story from the plants’ perspective. The author offers compelling stories for 4 specific plants: apples, tulips, marijuana (cannabis) and potatoes. In apples we sought sweetness, in tulips we saw beauty, in marijuana we craved intoxication, and in potatoes we chose control.

The DVD version, like the book version, is quite enjoyable. The pictures are worth a thousand of the book’s words. The audiobook makes my commute a breeze.

A quick summary here:

1. Apples: Sweetness
In pursuit of sweetness (hard to come by in the early days), people planted apples to make hard cider (fermented into alcohol) as a “safe” drink instead of the then-contaminated water. The story of Johnny Appleseed (Chapman) and how this “bum” planted and sold apple trees grown from seeds in the 19th century was interesting. The evolution of the apple from its origin in Kazakhstan also sounded like a heroic journey. As a lover of apples, I found the apple story intriguing. So many varieties of apples (mostly not sweet) are found at its place of origin, and yet we mostly consume a small subset of them, like my favorites: Fuji and Golden Delicious. The lack of genetic diversity is due to the fact that modern apples are mostly grafted, because apples don’t grow true from seeds (what’s wrong with Johnny Appleseed?).

2. Tulips: Beauty
I found it hard to believe that until recent centuries, flowers were not appreciated and were considered pointless. This is probably limited to Western culture. The author goes into great detail on the Tulip Mania in Holland in 1635. Interesting tidbits: the highest priced, most exotic tulip during the Tulip Mania was infected with a virus that gave it a strange color pattern. Another one: 200 million years ago, there were no flowers. The advent of flowers created an attraction for the pollinators, which, upon being gratified, do the leg work for the plants. How convenient evolution is.

3. Marijuana (Cannabis): Intoxication. I got to learn about the ambivalence of the law in dealing with the legality of using marijuana, especially in Holland. I also learned about the differences between Indica and Sativa – how the two species produce different psychoactive properties – and about humans’ quest to grow marijuana indoors due to the unfavorable law. The author’s encounter with the local police chief while growing a marijuana plant out back was hilarious. The political and legal pendulum on marijuana use adds some spice to the evolution of this plant in our society. For most of human history, psychoactive drugs and plants (THC in marijuana) have played an important role, especially in arts, music, literature and even religion, and we often hear about celebrities’ drug abuse and overdoses. I found it fascinating that what marijuana does chemically is allow us to forget, especially painful experiences. I guess it’s a form of numbing that makes us less stressed, hence happier.

4. Potato: Control
In this section, the author touches on genetic engineering, like the GMO (trade name: New Leaf) slips of Russet potato that are capable of producing a natural Bt that kills the potato beetle. Historically, the rise of the potato, thanks to its ease of planting, brought prosperity to Ireland in the 18th century. Then the famine, caused by a single fungus that killed off the entire monoculture potato crop within a week, was rather dramatic, underlining the author’s point: the danger of monoculture for all crops. Since McDonald’s uses only Russet potatoes to make its famous French fries, the risks of pest invasion and pesticide resistance in growing this type of potato are high. We can all learn from the original South Americans who domesticated potatoes and planted multiple kinds – biodiversity – for disease and pest control.

My Battle with the Attic Rats – an Update

I claimed victory too early over the rats in the attic of my apartment in my previous blog. The rat(s) came back with a vengeance. I’m assuming there’s more than one for now. Check out what they did to this bait box when I checked it yesterday:
Chewed up Tomcat Bait Box
They were so anxious, and probably too large, to get to the bait. They almost cut through the bait box with their teeth. I decided to throw more bait directly into the attic to treat them to a large last meal. Will update soon, I hope.
Attic with baits

Learn by Blogging (and Sharing) – Derek Tsai's Personal Blog