06: Concurrency and Streams
July 21, 2020
This article is a part of a series where I go through teachyourselfcs. If you would like to start at the beginning start here.
A connection goes to a host, identified by its “internet address” (IP address), and to a port on that host.
IPv4 has about 4 billion internet addresses and we are running out, not because there are 4 billion computers but because many addresses in allocated blocks go unused.
IPv6 was created to solve this.
HTTPS: secure HTTP.
Socket: an abstract data type representing a 2 way internet connection.
Three way handshake: the client sends to the server, the server sends back to the client, and the client sends to the server once more.
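To make the host, port, and socket ideas concrete, here is a minimal sketch using Python's standard `socket` module. The helper names (`run_echo_server`, `echo_once`) are my own, and everything runs on the loopback address 127.0.0.1 with a port picked by the operating system, so it only approximates a real internet connection. The three way handshake is not written out in the code; the operating system performs it when the client connects.

```python
import socket
import threading


def run_echo_server(server_sock):
    """Accept one connection and send the received bytes back, uppercased."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def echo_once(message):
    """Open a server socket, connect a client to it, and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        # Port 0 asks the OS for any free port on the loopback host.
        server_sock.bind(("127.0.0.1", 0))
        server_sock.listen(1)
        host, port = server_sock.getsockname()

        server = threading.Thread(target=run_echo_server, args=(server_sock,))
        server.start()

        # Connecting to (host, port) is where the OS does the TCP handshake.
        with socket.create_connection((host, port)) as client:
            client.sendall(message)      # client -> server
            reply = client.recv(1024)    # server -> client (2 way)

        server.join()
    return reply


print(echo_once(b"hello"))
```

The same socket object is used for both sending and receiving, which is what makes it a 2 way connection.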
Your processor has memory inside of it (registers) for doing computations on data very fast; values are loaded into it from RAM and stored back afterward. If a second thread’s load happens before the first thread’s store, you can get a concurrency issue.
You have to “lock out” the variable while it is being used.
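The load/store problem can be replayed by hand without any real threads. This is only a sketch: the two "threads" A and B, the withdrawal amounts, and the `run_schedule` helper are made up for illustration, and each step is executed in exactly the order the schedule lists.

```python
def run_schedule(balance, schedule):
    """Replay load/store steps of two withdrawals in the given order."""
    amounts = {"A": 10, "B": 25}   # hypothetical withdrawal amounts
    registers = {}                 # each thread's private copy of balance
    for thread, step in schedule:
        if step == "load":
            registers[thread] = balance
        else:  # "store": write back the value computed from the old load
            balance = registers[thread] - amounts[thread]
    return balance


# One thread finishes completely before the other starts.
serial = [("A", "load"), ("A", "store"), ("B", "load"), ("B", "store")]

# B loads before A stores, so B never sees A's withdrawal.
interleaved = [("A", "load"), ("B", "load"), ("A", "store"), ("B", "store")]

print(run_schedule(100, serial))       # 65
print(run_schedule(100, interleaved))  # 75
```

In the interleaved schedule, A's withdrawal is silently lost, and nothing crashes to tell you about it.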
Serializer: takes a procedure and returns a protected procedure. Prevents procedures from overlapping.
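A serializer can be sketched in Python with a lock from the standard `threading` module. This is an illustration of the idea, not the course's implementation; `make_serializer`, `Account`, and `run_demo` are names I chose for the example.

```python
import threading


def make_serializer():
    """Return a function that turns procedures into protected procedures."""
    lock = threading.Lock()

    def serializer(procedure):
        def protected(*args, **kwargs):
            # Only one protected procedure from this serializer runs at a time.
            with lock:
                return procedure(*args, **kwargs)
        return protected

    return serializer


class Account:
    def __init__(self, balance):
        self.balance = balance
        protect = make_serializer()
        # Both procedures share one serializer, so they cannot overlap.
        self.deposit = protect(self._deposit)
        self.withdraw = protect(self._withdraw)

    def _deposit(self, amount):
        self.balance = self.balance + amount

    def _withdraw(self, amount):
        self.balance = self.balance - amount


def run_demo(n_threads=4, times=10_000):
    """Deposit and withdraw the same amount from several threads."""
    account = Account(100)

    def work():
        for _ in range(times):
            account.deposit(1)
            account.withdraw(1)

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return account.balance


print(run_demo())  # 100
```

Because every deposit is matched by a withdrawal and no two protected procedures overlap, the balance always ends where it started.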
Better to crash than to get a wrong answer, because wrong answers pile up without you noticing them.
Inefficiency: not using parallelism at all.
Deadlock: 2 threads locked, waiting for each other to finish.
Unfairness: process A always wins and process B can’t win.
Correct answer: consistent with some sequential order of the evaluated threads.
Same answer as with no parallelism.
serial: one after the other
Most programs aren’t written with deadlock in mind, because processes are very fast, so a deadlock rarely has the chance to happen.
Databases find out about a deadlock from their thread bookkeeping, then kill one of the threads.
Resource starvation: several threads want a resource, but one thread keeps getting it.
Mutex: an object that a thread acquires before using a shared resource and releases afterward; only one thread can hold it at a time.
Memoization: remembering the result of an expression so that it can be reused to build the next result instead of being computed again.
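A small sketch of memoization, using Fibonacci as the example. The `memoize` helper and the `call_count` dictionary are my own additions, there only to show how many times the real computation runs.

```python
def memoize(f):
    """Wrap f so that each argument's result is computed only once."""
    cache = {}

    def memoized(n):
        if n not in cache:
            cache[n] = f(n)
        return cache[n]

    return memoized


call_count = {"fib": 0}  # counts how often the body of fib actually runs


@memoize
def fib(n):
    call_count["fib"] += 1
    # Each result is built from remembered results of smaller inputs.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))             # 832040
print(call_count["fib"])   # 31, one run for each n from 0 to 30
```

Without the cache, the same recursive definition would recompute the small cases over and over.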
A stream computes its values just in time (JIT): each value is calculated only when it is needed, which is lazy evaluation rather than JIT compilation.
Only use streams if the program is functional.
Parts of the interpreter and computer are non-functional, but they provide a layer of abstraction at the program level that allows for safe multiprocessing if you write a functional program.
Therac-25: radiation therapy software that had a bug that caused some cancer patients to die of overdose.
Software doesn’t degrade, but it has problems with reliability because of the number of things that can go wrong in it that the real world doesn’t have. Many layers of abstraction create a black box that can have underlying problems.
No limit to how complicated a software program can get.
Therac was multithreaded, so it had to do error checking.
There were error messages that weren’t really errors, so when a real error occurred operators bypassed it, assuming it was likely not a big deal.
The documentation didn’t help people understand the issues that happened.
Initially people thought it was operator error, since the machines worked most of the time (we know concurrency issues can cause this). Sometimes it actually was operator error, but that was because of poor user interface design and bad error messaging.
They found an error and fixed it, but just because you found a real error doesn’t mean there can’t be more errors in the program. They shipped the bugfix without extensive testing to make sure the issue was really resolved.
mapReduce is really mapSortReduce: it sorts the mapped values by key, then reduces each key’s values.
groupReduce: accumulates all values of a key.
Example: find the most common word in a song. To do this with multiprocessing, find the most common word for each starting letter. A single processor can then check the 26 results that the multiprocessed calculation produces.
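The most-common-word example can be sketched as map, sort, then reduce. This is an illustration and not the framework from the course: the function names are mine, and a `ThreadPoolExecutor` stands in for the separate processors, which is an assumption (Python threads do not give real CPU parallelism, but the per-letter reductions are independent in the same way).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby


def mapper(word):
    """Map: key each word by its starting letter."""
    return (word[0], word)


def reducer(group):
    """Reduce: find the most common word among one letter's words."""
    letter, words = group
    word, count = Counter(words).most_common(1)[0]
    return (letter, word, count)


def most_common_word(text):
    words = text.lower().split()
    # Map, then sort by key so that equal keys end up next to each other.
    pairs = sorted(map(mapper, words))
    groups = [
        (letter, [word for _, word in group])
        for letter, group in groupby(pairs, key=lambda pair: pair[0])
    ]
    # Each group is reduced independently, so this step could run in parallel.
    with ThreadPoolExecutor() as pool:
        per_letter = list(pool.map(reducer, groups))
    # Final single-process step: compare at most 26 per-letter winners.
    return max(per_letter, key=lambda result: result[2])


song = "la la la love me do you know i love you la"
print(most_common_word(song))  # ('l', 'la', 4)
```

The input here has no punctuation; real lyrics would need cleaning before the map step.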
Referential transparency: the ability to replace a subexpression with its value without changing the result.
Several processes can attempt to access the same state variable. This makes order matter, so access to it has to happen in sequential order.
Example of multiprocessing complications: 2 processes of 3 ordered events each. There are 20 possibilities for how these events can be interleaved. Programmers must think about all the possibilities. This creates huge complications when thinking about an entire system.
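The count of 20 can be checked by brute force: choose which 3 of the 6 time slots belong to the first process, and the rest go to the second, keeping each process's own events in order. The event names below are placeholders I made up.

```python
from itertools import combinations
from math import comb


def interleavings(a_events, b_events):
    """List every ordering that keeps each process's events in order."""
    total = len(a_events) + len(b_events)
    orders = []
    for a_slots in combinations(range(total), len(a_events)):
        a_iter, b_iter = iter(a_events), iter(b_events)
        order = tuple(
            next(a_iter) if slot in a_slots else next(b_iter)
            for slot in range(total)
        )
        orders.append(order)
    return orders


orders = interleavings(["a1", "a2", "a3"], ["b1", "b2", "b3"])
print(len(orders))   # 20
print(comb(6, 3))    # 20, the same count from the binomial coefficient
```

With 3 processes of 3 events each the same reasoning gives 1680 orderings, so the number grows quickly.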
Serialization: processes will execute concurrently, but certain collections of procedures will be executed sequentially. While a procedure in such a collection is being executed, it locks out other processes from running any procedure in that collection.
Deadlock: 2 procedures stalled because each relies on the other to finish.
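One common way to avoid this kind of deadlock is to always acquire locks in the same global order. The sketch below assumes two accounts, each with its own lock, and orders the locks by account name; the `Account` class, `transfer`, and `run_opposite_transfers` are hypothetical names for illustration.

```python
import threading


class Account:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.lock = threading.Lock()


def transfer(source, target, amount):
    # Lock in a fixed order (by name) no matter which way money flows, so two
    # opposite transfers can never each hold one lock and wait for the other.
    first, second = sorted((source, target), key=lambda account: account.name)
    with first.lock:
        with second.lock:
            source.balance -= amount
            target.balance += amount


def run_opposite_transfers(rounds=1000):
    a = Account("a", 100)
    b = Account("b", 100)

    def repeat(source, target):
        for _ in range(rounds):
            transfer(source, target, 1)

    threads = [
        threading.Thread(target=repeat, args=(a, b), daemon=True),
        threading.Thread(target=repeat, args=(b, a), daemon=True),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=10)
    finished = not any(t.is_alive() for t in threads)
    return finished, a.balance, b.balance


print(run_opposite_transfers())  # (True, 100, 100)
```

If `transfer` instead locked `source` first and `target` second, the two threads could each grab one lock and stall forever.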
Delayed evaluation: makes it possible to represent a very large sequence as a stream.
If a large sequence is represented as a list, it can create huge problems with both time and space complexity.
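Here is a rough sketch of a stream as a first value plus a procedure that computes the rest only on demand. The `Stream` class and the helpers are my own names, and finding the first two primes above 10,000 is just a convenient example of a sequence that would be wasteful to build as a full list.

```python
class Stream:
    """A first value plus a zero-argument function that computes the rest."""

    def __init__(self, first, compute_rest):
        self.first = first
        self.compute_rest = compute_rest

    @property
    def rest(self):
        # The rest of the stream is only computed when someone asks for it.
        return self.compute_rest()


def integers_from(n):
    """An unbounded stream n, n + 1, n + 2, ..."""
    return Stream(n, lambda: integers_from(n + 1))


def stream_filter(pred, s):
    """A stream of the elements of s that satisfy pred."""
    while not pred(s.first):
        s = s.rest
    return Stream(s.first, lambda: stream_filter(pred, s.rest))


def take(s, k):
    """Collect the first k elements of a stream into a list."""
    out = []
    for i in range(k):
        if i > 0:
            s = s.rest
        out.append(s.first)
    return out


def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))


print(take(stream_filter(is_prime, integers_from(10_000)), 2))  # [10007, 10009]
```

Only the integers up to 10009 are ever generated, even though the stream itself has no end.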
Delay: packages an expression into a procedure that can be called later.
Force: calls the procedure that delay creates.
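Delay and force can be sketched in Python, with two caveats. Python evaluates arguments eagerly, so the expression has to be wrapped in a `lambda` by hand. Also, this version of `force` remembers the result after the first call, which is an assumption on my part that combines delay with memoization. The `Promise` class and `expensive` function are made up for the example.

```python
class Promise:
    """Holds a delayed computation and, once forced, its value."""

    def __init__(self, thunk):
        self.thunk = thunk
        self.done = False
        self.value = None


def delay(thunk):
    """Package a zero-argument function so it can be run later."""
    return Promise(thunk)


def force(promise):
    """Run the delayed function the first time; reuse the value after that."""
    if not promise.done:
        promise.value = promise.thunk()
        promise.done = True
        promise.thunk = None  # the procedure is no longer needed
    return promise.value


log = []


def expensive():
    log.append("ran")
    return 6 * 7


p = delay(lambda: expensive())
print(log)        # [] because nothing has run yet
print(force(p))   # 42
print(force(p))   # 42 again, without rerunning expensive
print(log)        # ['ran']
```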