Table of Contents |
---|
...
Algorithms-Specific Informational Websites (not programming challenge websites)
Nick Parlante - Pointers, Binary Trees, etc.
Very good explanations
OK explanations
Books
Look good:
The Algorithm Design Manual by Skiena
Rec'd by Adam D'Angelo in a Quora post.
Essential Algorithms - 5 stars on O'Reilly
Working with Algorithms in Python - O'Reilly video series - 4.3 stars - Same guy who wrote 'Algorithms in a Nutshell'
Maybe good:
Meh:
Articles
http://haseebq.com/how-to-break-into-tech-job-hunting-and-interviews/
First, a high-level roadmap.
Watch the Princeton Coursera course on algorithms (by Robert Sedgewick), mostly up until Dijkstra’s algorithm and not much more than that. Try to follow along and code as you go. You don’t need to implement things in Java unless you already know Java.
Read The Algorithm Design Manual for general algorithms and data structures background. Go through most of it. (You can skip the giant appendix about specific algorithmic topics).
Buy Cracking the Coding Interview. We’ll be using this as a repository of coding problems and solutions.
(...)
If you’re working on an algorithms problem (such as out of CTCI) and can’t figure it out, spend no more than 20-30 minutes without making progress. Just go look up the answer. Contrary to popular belief, most struggling past 30 minutes is pointless.Some people will disagree with this. [NW: Like Tom and Chris Uga.]
Screw those people.
That said, if you find up having to look up the answer to more than 2/3rds of problems, your problem set is too hard and you need to downgrade.
But banging your head against a wall or staring blankly at code is a waste of time. For code/algorithms problems, you very much want to maximize the density of learning per hour.
Any time you look up the answer to a coding problem, try to understand the solution. Walk through it with a debugger if it’s particularly mysterious. Read multiple explanations of why it works.
Then—and this is extremely important—if you didn’t solve the problem completely on your own, treat the problem as though you never solved it at all. Put it back into your rotation of problems. Mark it as something you need to try again from scratch. Wait at least a day, and try to solve it fresh. If you fail, rinse and repeat ad infinitum.
And finally, structure the hell out of your learning. Make a schedule. Know exactly when you’re going to be working on this stuff, when you’re going to take breaks, when you’re going to go to lunch, etc. Build flexibility into your schedule, but have clear goals for how many hours a day you’ll spend studying.
Interview Challenge sites
this is by Kalzumeus
Front-end interview challenge / prep sites
Interview Prep sites
rec'd by Choketsu
Codility
Rec'd by Toptal
rec'd by Choketsu
via Tom A.
Non-Interview-Prep Challenge sites
Rec'd by Toptal
Someone who works with machine learning explaining how Kaggle is a simplification of real ML work:
Jason B Hill's Project Euler solutions
Rec'd by Toptal, Chris Uga
rec'd by Choketsu
How to get started training at the USACO Gateway:
What is it like to attend the USACO training camp?
It's only a week
New people just get 3 hours of lab/lecture in the morning
The top guys do a contest every day. Sometimes the contest goes until after lunch, otherwise they just review after lunch. (Seems like 5 hrs/day)
Then everyone has fun for the rest of the day
If one is stuck on a particular section, is it worthwhile to keep on trying to solve the given problem, or should one move on to other sources in preparation for programming contests?
Richard Peng, one of the most successful Canadian IOI competitors of all time, on the USACO training pages:
"USACO training was put together before the major IOI escalation (Finland/Korea/Wisconsin). A lot of the techniques described on it are no longer useful on *OI and a lot of the 'hot' topics over the past few years are not covered. Also, a lot of the bottle necks on training are quite meaningless, and they typically cause a lot of frustration and time waste on the scale of months."
Algorithms Unlocked
Summary
1. What Are Algorithms and Why Should You Care?
2. How to Describe and Evaluate Computer Algorithms
3. Algorithms for Sorting and Searching
4. A Lower Bound for Sorting and How to Beat It
5. Directed Acyclic Graphs
6. Shortest Paths
7. Algorithms on Strings
8. Foundations of Cryptography
9. Data Compression
10. Hard? Problems
...
Key things I want to remember for the purposes of system design interviews:
Think about how you could break down an app into different services (e.g. login vs searching vs. completing a cart) hosted in different server pools, or how you could shard the data effectively (for example, maybe first by geography, then by user ID).
Summary / Notes
Reduce the Equation
Rule 1 - Don’t Overengineer the Solution
Rule 2 - Design Scale into the Solution (D-I-D Process) - Design Design for 20x capacity, implement Implement for 3x capacity, deploy Deploy for ~1.5x capacity
Rule 3 - Simplify the Solution 3 Times Over
Rule 4 - Reduce DNS Lookups
Rule 5 - Reduce Objects Where Possible
Rule 6 - Use Homogenous Networks
Distribute Your Work
Rule 7 - Design to Clone Things (X Axis) - Be able to duplicate data/services
Rule 8 - Design to Split Different Things (Y Axis) - You can split up your app / data / servers based on either actions the user/app can take (“services”) or by data the app may need to access (“resources”).
Services examples - signup, login, search, browse, view, add-to-cart, purchase/buy - “The data needed to perform any one of these can vary from the data needed for the others”
Resources examples - product catalog, product inventory, user account information, marketing information
Rule 9 - Design to Split Similar Things (Z Axis) - Set data up into unique and separated shards or swimlanes - Separate the db / servers by things like customer ID, name, geography, etc.
Design to Scale Out Horizontally
Rule 10 - Design Your Solution to Scale Out--Not Just Up
Rule 11 - Use Commodity Systems (Goldfish Not Thoroughbreds)
Rule 12 - Scale Out Your Data Centers
Rule 13 - Design to Leverage the Cloud
Use the Right Tools
Rule 14 - Use Databases Appropriately
Rule 15 - Avoid Using Firewalls - Firewalls cause issues with scalability and availability, so only use them when they significantly reduce risk or are required for compliance.
Rule 16 - Actively Use Log Files
Don’t Duplicate Your Work
Rule 17 - Don’t Check Your Work - Don’t read data you just wrote unless it’s a regulatory / legal requirement, and if so, consider doing it asynchronously.
Rule 18 - Stop Redirecting Traffic
Rule 19 - Relax Temporal Constraints - For example, requiring items being available from when a viewer views them until they purchase them.
Use Caching Aggressively
Rule 20 - Use CDNs
Rule 21 - Use ‘Expires’ Headers
Rule 22 - Cache Ajax Calls
Rule 23 - Use Page Caches
Rule 24 - Use Application Caches
Rule 25 - Use Object Caches - These are used for data that may be expensive to regenerate, like the result set of complex db queries.
Rule 26 - Put Object Caches on Their Own Servers
They have a diagram of multiple web servers connecting to an application server object cache, which connects to multiple app servers, which connect to a db object cache, which connects to db servers.
They use the word “tier” to refer to these servers being another entire row in their diagram.
Learn from Your Mistakes
Rule 27 - Learn Aggressively - Use A/B testing, postmortems.
Rule 28 - Don’t Rely on QA to Find Mistakes
Rule 29 - Failing to Design for Rollback Is Designing for Failure - Practice rolling back in staging or QA.
Rule 30 - Discuss and Learn from Failures - Do postmortems
Database Rules
Rule 31 - Be Aware of Costly Relationships
Rule 32 - Use the Right Type of Database Lock
Rule 33 - Don’t Use Multiphase Commits
Rule 34 - Try Not to Use “Select for Update”
Rule 35 - Don’t Select Everything
Design for Fault Tolerance and Graceful Failure
Rule 36 - Design Using Fault Isolative “Swimlanes”
Rule 37 - Never Trust Single Points of Failure - Identify single instances on design diagrams and try to turn them into active/active configurations.
Rule 38 - Avoid Putting Systems in Series
Rule 39 - Ensure You Can Wire On and Off Functions
Avoid or Distribute State
Rule 40 - Strive for Statelessness
Rule 41 - Maintain Sessions in the Browser When Possible - Use cookies, this lets the user request be served by any web server in the pool.
Rule 42 - Use a Distributed Cache for States
Asynchronous Communication and Message Buses
Rule 43 - Communicate Asynchronously As Much As Possible
Rule 44 - Make Sure Your Message Bus Can Scale
Rule 45 - Avoid Overcrowding Your Message Bus
Miscellaneous Rules
Rule 46 - Be Wary of Scaling Through Third Parties - Own your destiny, reduce your total cost of ownership.
Rule 47 - Purge, Archive, and Cost-Justify Storage
Rule 48 - Remove Business Intelligence from Transaction Processing
Rule 49 - Design Your Application to Be Monitored
Rule 50 - Be Competent
Rule Review and Prioritization
...