Nathan Wailes - Blog - GitHub - LinkedIn - Patreon - Reddit - Stack Overflow - Twitter - YouTube
Algorithms, Data Structures, System Design and Interview Questions
- 1.1 Related pages
- 2 Non-algorithm-specific interview advice
- 3 Algorithms
- 3.1 A step-by-step process for solving algorithm questions
- 3.2 How to study
- 3.3 Data structures
- 3.3.1 Questions
- 3.4 Learning Resources
- 3.4.1 Algorithms Unlocked
- 3.5 Topics
- 3.5.1 Linked Lists
- 3.5.2 Databases
- 3.5.3 Dynamic Programming (DP)
- 3.5.4 Heaps
- 3.5.5 Recursion
- 3.5.6 Trees
- 3.5.6.1 Binary Trees
- 3.5.7 The interview process
- 3.6 Flashcards
- 4 System Design
- 5 Resumes
Related pages
My overall strategy on this page is to 1) summarize information I find around the web, and then 2) try to weave the information I have in those summaries into unified processes for handling DSA questions, system design questions, API design questions, how to format a resume, etc.
Non-algorithm-specific interview advice
Algorithms
A step-by-step process for solving algorithm questions
Understand that it is critical that you think out loud for the steps that follow.
Sources:
https://www.youtube.com/watch?v=7FJ5LQNOPmc&t=4m3s
The next issue that most people have is that they're a little too quiet, and I get it. A lot of us try to code on our own late at night without talking to anyone, but when you're in the workforce, you have to show how well you can communicate with others. The best way to do that is by thinking out loud. So when you're writing your code, definitely express what you're thinking line by line as you're coding it out.
Clarify the question with the interviewer.
Be sure you understand the limits of the input.
Define some of your own test cases.
Be sure you understand what the output should be for the test cases you define.
Sources:
https://www.youtube.com/watch?v=7FJ5LQNOPmc&t=3m40s
The biggest mistake that most candidates make is not clarifying the question. I think people are a little too eager to start coding, and they end up solving the wrong problem or forgetting a few test cases. The simple thing that most people can do to start clarifying the problem is just to ask questions about it, understand what the limits of the problem are, define their own test cases, and understand what the input should be for the expected output.
“What the input should be for the expected output”? Huh? Did he mean “what the output should be for a given input”?
Draw out a simple visual example.
Do it in something like Balsamiq.
Copy-paste it to visually show what the step-by-step process should look like.
Think about what variables you want at each step to answer the relevant question you have to be able to calculate the desired answer.
Ask yourself: What features of the state of each ‘step’ through the input can you think of to assign names to? Write the code needed to calculate/update these values with each iteration (or just have them as stub functions to be filled in later) and it could give you enough of a step in the right direction for the rest of the solution to make itself clear.
Step through the code with a simple example.
2024.09.14 - I’m thinking it may make sense to keep track of variables in a separate comment block, keeping track of the call stack / execution contexts as well. So basically doing what Jetbrains does, but manually.
One thing I’m noticing is that if I have the comment block with the variables above the code, then I can’t easily reference line numbers in the code because adding new lines to my comment block changes the line numbers of the code. So it seems better to keep the comment block below the code.
After you write out your code, step through it line-by-line with at least one of the test cases.
Sources:
https://www.youtube.com/watch?v=7FJ5LQNOPmc&t=4m58s
The other most common mistake that I've seen other people make is assuming that their code works, and what I mean by this is that after they go through the entire interview, they're just so glad to write down all the code that they forget to test if it actually works or does what it's supposed to do. So what candidates should be doing here is to actually go through their code line by line against one of the test cases that they outlined at the beginning of the interview. If you do that, then it's not just assuming that it works, it actually does work, and it shows the interviewer that you can really communicate your code and what's going on.
Explain the running-time of your code by walking through the code and explaining the running-time of each area.
Sources:
https://www.youtube.com/watch?v=7FJ5LQNOPmc&t=4m26s
The next topic I want to talk about is that people tend to struggle a little bit with the running time. What I mean by this is that most people feel like they have to just blurt out the running time or answer it like it's a multiple-choice exam or some other homework question. When someone asks you what the running time is, people don't want you to just say it's linear or n squared or whatever. They really want you to explain different areas of your code and what the running time is for those areas. If you can do that, you'll stand out a lot more than people who are just treating it like another exam question.
How to study
Memorize how to work with basic data structures (CRUD and search operations).
lists, dicts, sets, deques
Memorize some basic algorithms that are used as building blocks in many LeetCode questions.
Bookmark the two links below in a Chrome bookmark folder so you can quickly open all three links every day (right-click the folder and select “Open all”):
Every day, learn one of the basic algorithms listed in this pastebin if you haven’t already learned all of them.
Start from the top and work your way down.
To “learn” an algorithm:
Read the pastebin until you feel like you understand how the algorithm works.
Write out the algorithm on online-python.com with the pastebin open in a separate window (have the two windows open side-by-side).
I recommend turning off their autocompletion feature as I find it more of a hindrance than a help.
Set a timer for 60 minutes and then see if you can write out the algorithm on online-python.com from memory.
If you can’t, just keep trying that day until you can. Adjust the timer duration as necessary.
Every day, review two of the basic algorithms you’ve already learned.
Write out the algorithm, including a test case, from memory on online-python.com
You should be able to do this review in ~5 minutes.
Memorize the NeetCode 150.
I put all of the Neetcode 150 in Anki, separated by question type.
Creating Replit versions:
If I can’t access the question on Leetcode (it’s a premium question) or the Lintcode version (or if the Lintcode version is too different from the Leetcode question), I can create a Replit version in here.
Make the ‘template’ version of the question, then when you actually want to do the question, use that same Replit and you can just use the git feature to discard all of the changes you made to the code when you’re done.
Thought: I want to have a way to separately review a high-level description of how to solve a problem vs. reviewing the writing out of the actual solution. Like, I want to be able to do 10 high-level solutions per day (where I just have to remember a plain-English sentence or two about how to solve a question) and 1 written-out solution (where I actually have to write out the solution and run it to make sure it works). I think this is a good way to more-frequently refresh my memory of solutions without spending too much time per day. I want the whole thing to take ~15 minutes.
Thought: It’s sometimes not clear when I “know” a solution. Like, I solved a linked list question, but then I saw in the LeetCode explanation that there was a more-efficient solution. So when do I review the card? In a day, to see if I remember the efficient solution? Or in 6 months, when I might have forgotten my less-efficient solution?
2024.08.27 - What I’m doing now is aiming to spend an hour per day on the NeetCode 150, where my goal is to memorize the approach used to solve the question rather than try to “figure out” the question on my own. So I will spend a few minutes trying to remember the approach to solve the question, if I can’t then I just look at the LeetCode Editorial (both explanation and code) and set the due date of the card for 7-14 days from now to see if I can remember the approach at that time. Once I can remember the approach, I then see if I can write out the code. So I’m using the same Anki card to test my memory of two different things: 1) if I can remember a plain-English description of the approach for solving the question, and 2) if I can remember both the plain-English description and also how to write out the code.
If I end up with extra time from solving a question quickly, I’ll keep going with more NeetCode 150 Anki cards until the hour is up.
2024.08.29 - I think the way to do this is:
For each card, just read the editorials of the different approaches and set the card for a day or two out and just make sure you can remember the various different approaches. Interviewers apparently want to see that you can identify the different possible approaches to a problem.
Once you can identify the different approaches, set the card for maybe 7-14 days out to see if you can actually write out the ideal approach (and maybe one of the unideal approaches?).
2024.08.31 - If you’re limited on time on a certain day, try to make at least 20 minutes of progress so you don’t lose momentum. Just watch the LC Editorial video for a question, or just watch the Neetcode video and try to summarize what you see, or just skim the explanations and try to fill out the Anki card with a summary of what you read. Or just read the explanation of one of the possible approaches in-depth and try to summarize it.
Example: I’m taking several days to work on the LRU Cache LC question.
The first day I just opened the editorial, skimmed it, and wrote out on the Anki card “1. doubly-linked list. TODO: fill this out” and “2. use built-in data structures like OrderedDict(). You need to know how to use the .move_to_end() and .popitem() methods”.
The second day I skimmed the explanation of the first approach in greater detail so that I could fill out a summary on the Anki card.
My idea is that on a third day I’ll skim the code of one or both explanations and then try to write it out from memory afterwards (immediately afterwards? after 30 minutes? I’m not sure yet).
2024.09.06 - I had a good idea today: sometimes I’ll solve a question using the ‘normal’ approach but still feel like the card is worth reviewing to test myself on an alternative way of solving the question. So today I just made a note of what solution I wanted to try next time in my note in the answer section of Anki, and made a similar note in the ‘Question’ section so that the next time I see the card I’ll remember what my goal is:
2024.09.07
When you’re first learning a question, you can use Anki to just test your ability to read the prompt and identify the major issues / tricky things about the question, and the possible approaches for dealing with them. Don’t even bother trying to code the solution until you’ve internalized that much.
I’m also thinking it might make sense to skip the Hards until you’ve first gone through and become comfortable with the Mediums in every topic. I think I’m going to skip the Linked List Hards until I get through the rest of the questions first.
I’m also thinking I should maybe add all the Easy questions to the list of daily practice problems that I randomly choose two from to write out.
2024.09.10 - I’m working on a binary tree LC question (LC 102) and it’s occurring to me while I try to step through an example that I don’t have a clear system for writing out the state of the variables as I step through the code while keeping comments to the side. For example, I’m just now deciding that I will try to keep track of the state of variables where they’re defined, and then if there’s a .pop from it, I just add a comment pointing up to indicate that I should look above for the state.
But it’s making me realize that there should really be a standardized / best-practice way of doing this, but it’s not something I see even being discussed online.
2024.09.10
When I first see a new problem, I don’t have it as a goal to write out the code. I just want to spend literally a few minutes thinking about it to see if I can guess what the correct approach is, and then I look at the LC Editorial / NC video to see what the correct approach is. I think this is a much nicer way to do things, much less likely to waste your time. Previously I would spend hours trying to figure it out on my own. You don’t want to waste time going in the wrong direction.
I’m looking at LC 572, supposedly an “Easy”, and the latter two solutions in LC are both closer to being Hard questions. So I guess I shouldn’t worry about memorizing every possible solution to a question as I move through questions. Maybe set it as a later goal.
Try solving questions on LeetCode that are labeled as being asked by the company you’re trying to apply to.
Data structures
Questions
What is the threshold at which point it's a good idea to create a class for a data structure?
Learning Resources
Algorithms-Specific Informational Websites (not programming challenge websites)
Nick Parlante - Pointers, Binary Trees, etc.
Very good explanations
OK explanations
Books
Look good:
The Algorithm Design Manual by Skiena
Rec'd by Adam D'Angelo in a Quora post.
Essential Algorithms - 5 stars on O'Reilly
Working with Algorithms in Python - O'Reilly video series - 4.3 stars - Same guy who wrote 'Algorithms in a Nutshell'
Maybe good:
Meh:
Articles
Interview Challenge sites
Front-end interview challenge / prep sites
Interview Prep sites
rec'd by Choketsu
Codility
Rec'd by Toptal
rec'd by Choketsu
via Tom A.
Non-Interview-Prep Challenge sites
Rec'd by Toptal
Someone who works with machine learning explaining how Kaggle is a simplification of real ML work:
Jason B Hill's Project Euler solutions
Rec'd by Toptal, Chris Uga
rec'd by Choketsu
How to get started training at the USACO Gateway:
What is it like to attend the USACO training camp?
It's only a week
New people just get 3 hours of lab/lecture in the morning
The top guys do a contest every day. Sometimes the contest goes until after lunch, otherwise they just review after lunch. (Seems like 5 hrs/day)
Then everyone has fun for the rest of the day
If one is stuck on a particular section, is it worthwhile to keep on trying to solve the given problem, or should one move on to other sources in preparation for programming contests?
Richard Peng, one of the most successful Canadian IOI competitors of all time, on the USACO training pages:
"USACO training was put together before the major IOI escalation (Finland/Korea/Wisconsin). A lot of the techniques described on it are no longer useful on *OI and a lot of the 'hot' topics over the past few years are not covered. Also, a lot of the bottle necks on training are quite meaningless, and they typically cause a lot of frustration and time waste on the scale of months."
Algorithms Unlocked
Summary
1. What Are Algorithms and Why Should You Care?
2. How to Describe and Evaluate Computer Algorithms
3. Algorithms for Sorting and Searching
4. A Lower Bound for Sorting and How to Beat It
5. Directed Acyclic Graphs
6. Shortest Paths
7. Algorithms on Strings
8. Foundations of Cryptography
9. Data Compression
10. Hard? Problems
Topics
Linked Lists
Databases
Dynamic Programming (DP)
https://activities.tjhsst.edu/sct/lectures/0405/dp1.pdf
"Dynamic programming, often abbreviated "DP", is a technique to efficiently solve recursion problems -- problems whose solutions depend on solutions of smaller problems -- by storing partial results. Many times, it is obvious that a given problem is a recursion problem. However, it may not be so obvious that using DP will tremendously speed up the solution. We start by looking at an example."
Heaps
Recursion
Trees
Binary Trees
The interview process
http://seldo.com/weblog/2014/08/26/you_suck_at_technical_interviews
http://blog.triplebyte.com/how-to-pass-a-programming-interview
https://medium.com/@evnowandforever/f-you-i-quit-hiring-is-broken-bb8f3a48d324#.jhferyy53
Flashcards
Here is an explanation of how to programmatically create Anki deck packages: http://decks.wikia.com/wiki/Anki_APKG_format_documentation
Here's an example of people collaborating on an Anki deck: http://decks.wikia.com/wiki/Main_Page
Javascript problem manager: https://github.com/imaginate/algorithmIV-question-manager
System Design
Step-by-step process
Clarify the task.
Share what you know about the system with the interviewer.
If you’re not familiar with the system, tell the interviewer so they can explain how it should work.
Constrain the problem to make it solvable within the hour.
List the different metrics you could consider tracking / focusing on and explain the one you think we should focus on and why.
Total users, DAU, WAU, MAU. Maybe daily minutes per active user.
Uber: driver ratings, trip completion rates, and acceptance rates, number of rides completed per day, number of cancelled rides, number of rider complaints raised, average loading time, ratio of daily active users to monthly active users, number of daily rides booked per daily active user, average revenue per user, average wait time
Clarify the scale of usage, the amount of data involved, the rate of requests.
Specify some key metrics to help your high-level decision-making.
Example: the number of users, the amount of data.
Try to determine the number of requests per day, divide by 100,000 (seconds per day) to get the average requests per second, and take a guess at the peak factor (how much times the number of requests per second you might see in peak traffic), take a guess at how many requests per second a typical server might be able to handle (say, 1000), and use that to determine how many servers you’ll need.
It doesn’t have to be perfect, the idea is to see that you’re aware of these ideas.
If you’re going to be using auto-scaling servers, that’s fine.
(Source)
You don’t necessarily have to go into the details of scaling at this point.
Design the system at a high level.
At an early stage, it’s OK to have a few simple components; later on you can get more complex.
Example:
If asked to design Spotify, you could draw a mobile app connected to a load balancer which connects to a group of webservers, which connect to the database.
Outline your database design.
After laying out your components, design your database.
Split out databases appropriate for different types of data.
In general, try to explain why you’re doing something upfront rather than waiting for your interviewer to ask you.
Go through a normal use-cases and explain/flesh out how it will work in greater detail.
Consider how the webservers will retrieve data from the db and how they’ll send it to the end-user; will they use websockets?
If you have more time, think about how things could “go wrong” in terms of scaling.
Examples:
When designing Spotify, some songs might be requested by users for listening far more than others.
Solutions:
A CDN
Learning resources
Abbott & Fisher
APIs: A Strategy Guide: Creating Channels with Application Programming Interfaces
The Art of Capacity Planning
Written by the manager of data operations for Flickr.com
First edition:
The Art of Capacity Planning: Scaling Web Resources
Second edition:
The Art of Capacity Planning: Scaling Web Resources in the Cloud
ByteByteGo / Alex Xu
https://www.designgurus.io/ - Grokking the System Design Interview (online course)
Designing Distributed Systems: Patterns and Paradigms for Scalable, Reliable Services
microservice-api-patterns.org - This looks really interesting.
NeetCode system design courses - https://neetcode.io/courses
https://drive.google.com/file/d/1U7EchvgzCjTtF5JzGVVCFgjXC-qw7lIG/view?pli=1 ← They send you a link to this ebook when you sign up for their newsletter.
https://www.algoexpert.io/systems/product
Negative review: https://www.reddit.com/r/cscareerquestions/comments/s0e54f/warning_systems_expert_by_the_algoexpert_guy_is/
“modules are very light on information, and you'd have a better time just reading the wikipedia page on those topics.”
“topics aren't well explained and it's just an extremely high level overview of what they are”
Udemy: https://www.udemy.com/courses/search/?src=ukw&q=system+design
API Design Patterns
Summary
PART 1: INTRODUCTION
1 Introduction to APIs
2 Introduction to API design patterns
PART 2: DESIGN PRINCIPLES
3 Naming
4 Resource scope and hierarchy
5 Data types and defaults
PART 3: FUNDAMENTALS