Hack hack hack...

An open journal-- some of it written for you, but most of it is for me.

Intro to React

React event

  • similar to normal event handling (responds to target and such, but has other properties)

Refs and the DOM

Props

  • They are the mechanism used in React for passing data from parent to child components
  • Props can’t be changed from inside the component; they are passed and “owned” by the parent.

State

  • React’s components can have mutable data inside this.state.
  • When the state is updated, the component triggers a reactive render: the component itself and its children are re-rendered. This happens very quickly thanks to React’s use of a virtual DOM.

Component to action to store

  • the store is subscribed to the action
  • components subscribed to stores -> re-render
  • props are immutable -> given the state of the application, what does the component look like?

  • ?what is the value of the actions step -> usually just passing through to the store

  • user input triggers the action

  • inputs into the system -> actions
  • action updates the state
  • which updates the components

  • store is more flexible than backbone collections and models

  • based on the change, what literal DOM manipulations are needed -> taken care of by the virtual DOM
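The component -> action -> store loop in these notes can be sketched in a few lines of Python. This is only an illustration of the data flow, not React or any Flux library: the `Store` class, the `INCREMENT` action type and `counter_component` are made-up names.

```python
from typing import Callable, Dict, List


class Store:
    """Holds application state and notifies subscribers when it changes."""

    def __init__(self) -> None:
        self.state: Dict[str, int] = {"count": 0}
        self._listeners: List[Callable[[Dict[str, int]], None]] = []

    def subscribe(self, listener: Callable[[Dict[str, int]], None]) -> None:
        self._listeners.append(listener)

    def dispatch(self, action: Dict[str, str]) -> None:
        # The action only describes what happened; the store decides how state changes.
        if action["type"] == "INCREMENT":  # hypothetical action type
            self.state = {**self.state, "count": self.state["count"] + 1}
        for listener in self._listeners:
            listener(self.state)


rendered: List[str] = []


def counter_component(state: Dict[str, int]) -> None:
    # A "component": given the state, what does the output look like?
    rendered.append(f"<span>{state['count']}</span>")


store = Store()
store.subscribe(counter_component)
store.dispatch({"type": "INCREMENT"})  # user input -> action -> store -> re-render
```

The component never changes the state itself; it only re-renders from whatever the store hands it.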

How Linux Works: What Every Superuser Should Know

Posix shell and Utilities

fork() & exec()

  • When the shell runs a program, it forks a child process and calls exec() in that child
  • Fork and exec
  • Why do shells call fork?
    • exec() replaces the process that calls it, so it can only run one thing. The shell therefore forks a child, the child runs exec(), and the shell waits for it and then returns to the prompt.
  • fork() clones the current process, creating an identical child. exec() loads a new program into the current process, replacing the existing one.
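A minimal sketch of what a shell does for each command, using Python's thin wrappers around the same system calls (POSIX only). The helper name `run_in_child` is made up for this example.

```python
import os
import sys


def run_in_child(argv):
    """Run argv the way a shell does: fork a child, exec in it, wait for it."""
    pid = os.fork()                      # clone the current process
    if pid == 0:
        try:
            os.execvp(argv[0], argv)     # replace the child with the new program
        finally:
            os._exit(127)                # only reached if exec itself failed
    _, status = os.waitpid(pid, 0)       # the parent (the "shell") waits
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


exit_code = run_in_child([sys.executable, "-c", "import sys; sys.exit(3)"])
```

The parent keeps running after the child is done, which is exactly why the shell needs fork() before exec().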

Time slice

  • time slice -> managed by the CPU scheduler. Time slices are variable -> needed to keep up the illusion of concurrency (multiple processes running at the same time) and multitasking, but with one core there is no possibility of parallelism.

Builtins

  • Wikipedia: https://en.wikipedia.org/wiki/Shell_builtin
    • executed directly in the shell itself, instead of an external executable program which the shell would load and execute.
    • Shell builtins work significantly faster than external programs, because there is no program loading overhead.
    • most notable example is cd
    • cd has to be a builtin because the shell itself needs to change its “cwd” - current working directory - not that of a sub-process. The goal of “cd” is to change the current working directory of the shell itself, and that can’t be accomplished from a child process without a lot of special hackery which would end up being more complex than the builtin.
      • great explanation on how it came to be with a quote from Dennis Ritchie
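Why cd cannot be an external program can be checked directly: a directory change made in a child process never reaches the parent. A small POSIX-only sketch, with a made-up helper name:

```python
import os


def cwd_after_child_chdir(target):
    """Fork a child that changes directory; report the parent's cwd before and after."""
    before = os.getcwd()
    pid = os.fork()
    if pid == 0:
        try:
            os.chdir(target)   # affects only the child process
        finally:
            os._exit(0)        # never let the child fall through into parent code
    os.waitpid(pid, 0)
    return before, os.getcwd()


before, after = cwd_after_child_chdir("/")
```

`before` and `after` are the same path, because the child's working directory dies with the child.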

Memory Management

  • swap -> using the hard disk as RAM
  • A page table is the data structure used by a virtual memory system in a computer operating system to store the mapping between virtual addresses and physical addresses. Virtual addresses are used by the accessing process, while physical addresses are used by the hardware, or more specifically, by the RAM subsystem.
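A toy version of the lookup a page table performs, assuming a single-level table and 4 KiB pages. Real tables are multi-level and walked by hardware; the mappings here are invented.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def translate(page_table, virtual_address):
    """Map a virtual address to a physical one via a {virtual page: physical frame} dict."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        raise LookupError(f"page fault: virtual page {page} is not mapped")
    return page_table[page] * PAGE_SIZE + offset


page_table = {0: 5, 1: 2}                 # hypothetical mappings
physical = translate(page_table, 4100)    # page 1, offset 4 -> frame 2, offset 4
```

An unmapped page raises here; in a real system that page fault is where swap comes in.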

Tree

Vimmin

Command-T was killing me. Vim was built against the system ruby while the command-t plugin was set to the default. So I had to jump through a few hoops described here. A pain. Done though.

More Ops Notes

Network partitioning causing exchange issues on rabbitmq

  • this was after network maintenance
  • solution was to delete the exchange, which is then automatically added back (documented here)

Ubuntu 16

  • ubuntu 16 doesn’t ship with upstart, so systemd is what we’ll use to control services -> systemctl commands
    • can also run /etc/init.d/<service> start
  • rsyslog is what we use to get the logs to papertrail
  • used the oracle version of Java 8

Digital ocean

  • backups (auto) are different than snapshots (not auto) and different in pricing as well

Chef

Project Aristotle

Reading Notes

  • decentralized control
  • manager surveys
    • different mediums to collect feedback
  • no exact patterns
  • rapport building -> chit chat, care about others as people as much as co-workers
  • group norms are stronger than the individual, even if the individual is strong, driven and accomplished
  • ?what is data of a strong group?
  • following up on hurtful interactions (saying the things that go unsaid)
  • Manager:
    • humility, my team is smarter than me -> vulnerability
    • fulfillment of creating a great team
    • setting communication norms
    • empathy norms and quick follow up

Crash Course DevOps

Wires, cables and WiFi

  • Fiber: bits sent as light beams, very little signal loss
    • faster than copper
  • WiFi is radio and then translated into physical bits over the wires
  • Bitrate -> bits per second -> how fast it can transmit
  • bandwidth -> how much data can you receive over a period of time
  • latency -> how long it takes, more hops to talk to Seiji than Kaitlin in the next room; ping time * distance

  • traceroute to see hops

IP addresses and DNS

  • ISP
  • IP is a unique address
    • IPv4 - 4 billion unique addresses isn’t enough
      • 32 bits, 4 octets
      • 32 bits long, 8 bits for each part of each address
      • DHCP -> how a device obtains and renews an IP lease on your network.
      • to the outside world, Flatiron has the same IP provided by ISP
      • router provides guests with temporary IP depending on how long your lease renewal is set
      • renegotiate IP ?- subnet mask
      • 255.255.255.255 is the max it could go, based on the octet
      • static IP -> unique in the world
      • private IP -> re-use private IP addresses behind networks
  • IPv6 - 128 bits instead of 32

    • hex representation to prevent us running out of static IPs
    • DNS servers are split up in TLDs

      • DNS on a router
    • google’s DNS will break it up by the TLD and domain

    • google DNS versus open DNS
    • ?www doesn’t mean anything any more, just another subdomain
    • ?CS of how the domain names at the DNS are stored?
  • TTL

    • this record will expire at X time
    • set TTL to an hour, it won’t have to look it up for an hour
    • DNS propagation
  • DNS -> record type

    • A record -> address
    • maps the domain name to an IP address
    • PTR -> pointer -> the reverse, input IP and get the domain
    • FQDN -> A fully qualified domain name (FQDN) is the complete domain name for a specific computer, or host, on the Internet.
      • Host name is gmail and .com is the TLD
    • CNAME -> mail.google.com is a CNAME of googlemail.l.google.com.
      • A records should be unique
    • why is there a dot at the end?
      • it is an absolute as opposed to the relative address
    • want multiple MX records, redundancy strategy
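The IPv4 arithmetic from the notes above (32 bits, four 8-bit octets, 255.255.255.255 as the ceiling, private ranges, 128-bit IPv6) can be checked with Python's standard `ipaddress` module. The address 192.168.1.10 is just an example value.

```python
import ipaddress

addr = ipaddress.IPv4Address("192.168.1.10")    # example address, not from the notes
octets = list(addr.packed)                       # four octets of 8 bits each
as_int = int(addr)                               # the same address as one 32-bit number

max_addr = ipaddress.IPv4Address("255.255.255.255")
total_v4 = int(max_addr) + 1                     # 2**32, roughly 4.3 billion addresses

is_private = addr.is_private                     # True: 192.168.0.0/16 is a private range
v6_bits = ipaddress.IPv6Address("::1").max_prefixlen   # 128
```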

Packet, routers and reliability

  • fault tolerant
  • TCP
  • ?How do they determine the route?

    • routers are owned by the ISPs and the companies behind them
  • traceroute domain.com

  • wireshark is a GUI of tcpdump

  • IP address can change and it has a serial number or a MAC address (hardware address)

HTTP & HTML

  • SSL and TLS (successor)
    • http listens on 80 and https listens on 443
    • sslv3 has been deprecated
  • certificate authority vouches for site
  • OpenSSL generates keys, authority will sign that
  • self-signed certs (like for nagios) or free letsencrypt certs will work fine
  • SSLS
    • discrete logarithm problem
  • handshake process -> how to keep box locked and send secret
    • put treasure in box, lock box, send to friend
    • friend puts lock on box, sends it back
    • I remove lock from box and send back
    • friend unlocks box
  • SSH
    • ssh agent forwarding lets the remote host use your local key through the agent; the private key itself is not sent along
    • asymmetric only used for authentication
    • symmetric is then used for encryption and decryption
  • bcrypt is what we use for email
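The locked-box steps above match Shamir's three-pass protocol, where the "locks" are modular exponentiations that commute. The sketch below is a toy illustration of the analogy only; it is not what TLS or SSH actually do, and the prime and helper names are choices made for this example. It needs Python 3.8+ for `pow(e, -1, m)`.

```python
import math
import secrets

P = 2**127 - 1  # a Mersenne prime; toy parameter, not a vetted choice


def make_lock():
    """Pick a secret exponent and its inverse mod P-1 (the 'lock' and its 'key')."""
    while True:
        e = secrets.randbelow(P - 3) + 2
        if math.gcd(e, P - 1) == 1:
            return e, pow(e, -1, P - 1)


def three_pass(secret):
    """Send `secret` (an integer between 1 and P-1) without ever sharing a key."""
    my_lock, my_key = make_lock()
    friend_lock, friend_key = make_lock()
    box = pow(secret, my_lock, P)      # I lock the box and send it
    box = pow(box, friend_lock, P)     # friend adds their lock and sends it back
    box = pow(box, my_key, P)          # I remove my lock and send it again
    return pow(box, friend_key, P)     # friend removes theirs and sees the secret


received = three_pass(123456789)
```

Recovering a lock from the boxes seen on the wire would mean solving a discrete logarithm, which is the hard problem the notes mention.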

DDoS

DevinOps in review

  • Octopus gem -> pooling, load balance select statements to read only replicas
  • DB sharding
    • complex SQL joins, will put different tables in different DBs
    • replication
  • floating IP points to the master
    • static IP never changes
    • use floating IPs rather than DNS because of propagation
  • usernames:
    • if you have a server on linux it should not be running as root
    • if it’s a system process then it would be ok to run as root
    • postgres is run as the postgres user
      • apache is run by deployer and passenger -> how ruby achieves parallelization (a module for apache)
    • deployer for Ironboard -> easy way to manage keys
    • who owns the process, who owns the directory where the config files live
  • /etc/passwd -> all the users on the box (who is logged in comes from who / w)
  • every service that has a port open to the internet should have its own user
  • username is usually the service name
  • /var is usually the log directory (/var/log)
  • /etc is usually the config directory
  • WAL-E
    • ships our binary logs to s3
    • write-ahead logs -> a change must be completely written to the log before it actually modifies the DB with a transaction. This is the mechanism for rollbacks.
      • master just sends the write-ahead logs to the slave which then just replays them
  • Backups
    • take a base backup, then ship these WAL backups off; a restore applies all the deltas on top of the base
    • can give us point in time restoration
    • these logs will fill up our disk
    • ? how long do we keep WAL-E
    • interface for deleting backups, s3 logs
    • client auth config of pg is pg_hba.conf
    • script to switch over primary/replica: turns recovery.conf into recovery.done
    • failover.sh on the other replica as root
      • breaks replication, now need to configure replication on the new replication server
    • iptables -> firewall

Permissions

  • User, Group, World -> multiple users on the same machine, make sure processes don’t do stuff they aren’t supposed to
  • chmod 755, 644
  • all directories have to be 7 (for the owner): the execute bit is what lets you enter a directory
    • directories point to other files
  • cd takes a directory (a kind of file) as an argument and needs execute permission on it to enter it
  • /etc/sudoers -> determines who has root permissions; visudo -> edits a temp copy and checks the syntax to prevent breakage
  • useradd give them a shell and name the user; want to add to the chef script
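To read modes like 755 and 644: each octal digit is read (4) + write (2) + execute (1), for user, group and world in that order. A short sketch using the standard `stat` module to render them the way `ls -l` would:

```python
import stat


def describe(mode):
    """Render a numeric mode the way `ls -l` does."""
    return stat.filemode(mode)


dir_755 = describe(stat.S_IFDIR | 0o755)     # 'drwxr-xr-x'
file_644 = describe(stat.S_IFREG | 0o644)    # '-rw-r--r--'

owner_can_enter = bool(0o755 & stat.S_IXUSR)   # execute bit on a directory = may enter it
world_can_write = bool(0o644 & stat.S_IWOTH)   # False: world only has read
```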

load balancer

  • front-end is learn.co, terminating SSL, in the
  • backend ->
  • should be root on

Debugging

  • passenger-status --show=requests
  • passenger-status

Automatic provisioning

cron

  • sends mail

IDE backend

  • backend application -> IDE umbrella
  • main one is learn-ide-server cookbook
  • different chef servers
  • wombat
  • traffic cop
  • elixir-build01 -> build elixir on server because linux
  • new server needs to be added to HAproxy
  • lb -> balance url_param token: hashing algorithm that determines which server the IDE connects to based on their Learn oauth token. (Optimization -> make the hashing algo more granular so that if a machine gets taken out of the rotation, the others aren’t rebalanced as well.)
  • Load balance assignment is determined by dynect traffic management
    • own health monitoring
  • gluster is fancy NFS
    • RAID -> redundant array of independent disks
  • rkt
  • PTY - pseudoterminal, terminal emulator, connect to an arbitrary shell
  • inotify -> FS event watcher
  • containerization
    • aufs loser -> overlayfs winner that was merged into linux
  • ? do you bump archived tar’ed files when you archive again
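The load balancer optimization noted above (taking one machine out of rotation should not reshuffle everyone else) is what rendezvous hashing gives you. The sketch below is not HAProxy's algorithm, only one technique with that property; the server names and tokens are invented.

```python
import hashlib


def _score(server, token):
    digest = hashlib.sha256(f"{server}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_server(servers, token):
    """Rendezvous hashing: each token goes to the server with the highest score."""
    return max(servers, key=lambda server: _score(server, token))


servers = ["ide01", "ide02", "ide03"]           # hypothetical server names
tokens = [f"token-{i}" for i in range(200)]     # stand-ins for oauth tokens

before = {t: pick_server(servers, t) for t in tokens}
after = {t: pick_server(["ide01", "ide03"], t) for t in tokens}   # ide02 removed
moved = [t for t in tokens if before[t] != after[t]]
```

Only the tokens that were on the removed server show up in `moved`; with a plain `hash(token) % len(servers)` scheme most tokens would land somewhere new.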

IDE Umbrella

  • for adding IDE server (node list)
  • dependencies in mix.exs; cowboy is the webserver, poolboy
  • applications are like slightly fancy supervisors, distillery release packaging
  • mix deps.get -> get dependencies
  • iex -S mix
  • :cowboy.start -> erlang library, cowboy is lib which is an atom
  • gen_server, an abstraction on top of processes in general
    • provides state and a little behavior
  • agent is just for state
  • every connection gets its own process
  • distillery builds them
    • release -> full replacement of previous release, must restart
    • upgrade -> release plus upgrade instructions, relup file, modify the code in memory -> hotswap; preferable to release
    • mix edeliver version qa
    • mix edeliver version production
    • mix edeliver build release
    • mix edeliver deploy release to qa
    • mix edeliver restart qa
    • bin/ide start
    • bin/ide attach -> quit with ^D, not ^c (will kill process)
    • bin/ide remote_console
    • mix edeliver build upgrade --with=0.1.0+232424323-hdhfd
    • mix edeliver deploy upgrade to qa -> will prompt for version number

Chef

  • server provisioning
    • consistency, automation
  • inheritance in the cookbooks, base cookbook -> security
  • like ruby base cookbooks
  • wrapper cookbook, like forking a cookbook without modifying it
    • increasing levels of abstraction
  • conventions of chef aren’t very clearly defined
  • differences between recipes and cookbooks?

Elastic Search

Intro

“elastic search is a full text search engine, non-relational DB, analytics engine”

  • has clustering and managed over REST
  • Suggest Explicit mapping

  • keywords -> non-analyzed data

  • full text -> analyzed

  • filters -> will need bool then filters

relevance -> score meta data field for document match

  • things are ranked
  • filters are faster and don’t have relevance (so if you don’t care, go with filters)
  • can boost relevance

multi-index multi-type

  • lots of power

Aggregation -> group by on steroids

  • SQL: group by the bucket
  • aggregator -> can do nested group by and data retrieval

Managing elastic search

  • clustering
    • odd number of nodes bigger than 1
      • 3 shards and replication
      • a third node can shuffle the shards and the replicas
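The "odd number of nodes bigger than 1" rule comes down to majority arithmetic: a partition may only elect a master if it holds a strict majority, so two halves can never both win. A minimal sketch of that arithmetic, with made-up function names (this is not an Elasticsearch API):

```python
def quorum(cluster_size):
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def can_elect_master(partition_size, cluster_size):
    """True if a partition of this size is allowed to elect a master."""
    return partition_size >= quorum(cluster_size)


three_node_majority = can_elect_master(2, 3)   # the 2-node side keeps working
three_node_minority = can_elect_master(1, 3)   # the isolated node steps back
two_node_split = can_elect_master(1, 2)        # a 1/1 split leaves nobody in charge
```

With two nodes a network partition stops the whole cluster, which is why the third node is worth having.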

ELK Stack

  • elastic search, logstash, and kibana
  • logstash
    • want to parse and stash log data
  • kibana
    • front end visualizer
  • now beats added -> lightweight, written in go -> for shipping

Split Brain issue

Tinc

  • VPN -> encrypts traffic between servers

Setting up elastic search

Preparing Data for Machine Learning

Why machine learning - you can predict the future

  • Explosion of data, more available than ever
  • sheer processing power available at a reasonable price point

Data Preparation Tools

  • Rstudio -> data prep and execute the machine learning
  • jupyter notebooks -> python in the cloud
  • excel -> stick with the tools you know
  • azure machine learning studio
  • scikit learn (competitor to r studio) for python
  • relational database tools (SQL queries)
  • sed / awk
  • python, r, sql most common languages to clean up data

Data Cleaning

  • missing values in data or repeating values in data (blanks, null, n/a, unknown, 999999)
  • what’s your business goal?
    • SPCA - can you help us predict what animals won’t get adopted?
    • got enough data to be significant, you can delete rows -> not always an option
  • can substitute a specific value (can go with a worst case scenario)
  • fill forward and fill backwards based on the last value we read -> going to have to write code
  • R is.na()
  • python pandas.fillna(), pandas.isnull()

  • excel tips ->

    • select individual columns, go to special, select blanks (will select every blank cell) and you can enter a value
  • repeated values

    • causes a bias in the data
    • duplicate row vs duplicate IDs
    • r duplicated()
    • python dataframe.drop_duplicates
    • SQL DISTINCT or correlated queries
    • correlated subquery -> for rows that are the same but have different IDs: find the two identical records and delete the one with the lower ID number
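The "going to have to write code" part for fill forward, plus the duplicate-row rule above, sketched in plain Python with no pandas. The set of missing markers mirrors the examples in the notes; the function names and the `id` field name are assumptions.

```python
MISSING = {None, "", "n/a", "unknown", 999999}   # markers that count as "no value"


def fill_forward(values):
    """Replace each missing value with the last real value seen."""
    filled, last = [], None
    for value in values:
        if value in MISSING:
            value = last
        else:
            last = value
        filled.append(value)
    return filled


def drop_duplicate_rows(rows, id_field="id"):
    """Of rows that match on every field except the ID, keep the one with the higher ID."""
    best = {}
    for row in rows:
        key = tuple(sorted((k, v) for k, v in row.items() if k != id_field))
        if key not in best or row[id_field] > best[key][id_field]:
            best[key] = row
    return sorted(best.values(), key=lambda row: row[id_field])


filled = fill_forward([1, None, "n/a", 4, 999999])
animals = [{"id": 1, "name": "Rex"}, {"id": 2, "name": "Rex"}, {"id": 3, "name": "Tom"}]
deduped = drop_duplicate_rows(animals)
```

A missing value at the very start has nothing to fill from and stays `None`; fill backward would be the same loop over the reversed list.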

Data Transformation

  • Decomposition
    • the more you know about the data, the better
    • one column represents two or more values (like addresses lumped together and finding the city and state)
    • return a 1 or 0 to represent spayed or neutered
  • Aggregation

    • how many copies of different books to keep in stock?
    • don’t want the day it was sold, want total number of books sold per wk/mon x date
    • select count(*) from sales group by (books sold per period)
  • Scaling

    • predict how long a patient admitted with the flu will need to stay in the hospital
    • because age has a much wider variance, you don’t see the smaller differences in features like temperature
      • create ranges -> normalizing, standardizing
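Two common ways to do the scaling mentioned above, sketched with the standard library. The sample ages and temperatures are made up, and both functions assume the values are not all identical (otherwise they divide by zero).

```python
import statistics


def min_max(values):
    """Normalize to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    spread = statistics.pstdev(values)
    return [(v - mean) / spread for v in values]


ages = [2, 35, 60, 90]                     # made-up sample values
temperatures = [36.5, 37.0, 38.5, 40.0]    # made-up sample values

scaled_ages = min_max(ages)
scaled_temps = min_max(temperatures)
z_temps = standardize(temperatures)
```

After scaling, age and temperature both span 0 to 1, so a few degrees of fever is no longer drowned out by decades of age.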

Conclusion

  • Training data versus test data

    • subject matter experts become important as they need to undetr
    • tumors -> which do you want, a model with more false positives or one with more false negatives
  • slides at aka.ms/confooml

  • data science and machine learning essentials
  • cleaning data with python
  • building your first machine learning experiment

The Soul in the Machine - Developing for Humans

With Christian Heilmann

Machines vs. Humans

  • machines don’t get tired or make mistakes as they fatigue
  • law is boring and machines don’t get bored
  • The future of employment -> company secretaries
  • machines handle grunt work and humans handle human decisions -> the more abstract the less likely you are to be replaced by a computer
  • the more predictable we are as programmers, the more likely we are going to be replaced
  • AI software that makes AI software
  • past being factory workers and finding the way to add value
  • the saddest aspect of life right now is that science gathers knowledge faster than society gathers wisdom.
  • just making money isn’t enough anymore
  • Orwell predicted cameras everywhere but didn’t predict we would buy the cameras
  • “Technological progress has merely provided us with more efficient means for going backwards.” -> Aldous Huxley
  • “the power of big data and psychographics”
  • inclusive design set that microsoft released
  • it’s not about allowing access but avoiding barriers
  • spotlight - show me my documents larger than 20 pages
  • netflix -> fast movement can be compressed more than slow movement
  • Aipoly -> identify objects on the phone, not running on the internet
  • facebook open sourced identifying objects
  • AI lip reading at 46% accuracy, humans have 12% accuracy
  • imagenet and open images dataset -> image data sets, properly trained data sets to play with
  • captionbot.ai

Intro to DNS

With Maarten Balliauw

  • people should know more about DNS and the http level

DNS 101

  • How the internet works

    • IPv4 or IPv6
    • Check the operating system’s own hosts file to see if a record is known
    • will check the DNS cache for the local machine
    • OS will ask the router and check its own host file and DNS cache
    • same process for the ISP
    • the ISP will then go to the root server, address of the authoritative server
    • 6 hubs look ups

    • nslookup google.com

    • dig A google.com +trace

    • two types of servers

      • authoritative (owns the domain)
      • cache (recursor) -> resolves the domain for you
    • DNS protocol designed in 1983
      • designed to map a domain name to an IP address
      • added TXT records and IPv6
    • TLD managed by separate organizations (verisign, canadian internet registration authority)
      • all make their own rules e.g. need to be a canadian to be a .ca domain name, transfer rules
    • hierarchical system:
      • hit . first
      • then TLD like org, com, ca
      • can also create maps within a specific domain and could create own hierarchy like google does
    • caches
      • TTL -
      • cannot clear cache at ISP
      • keep the old IP address to maintain both
  • DNS zones

    • UDP protocol
    • only 13 named root servers (a through m) across the world, each backed by many machines
    • root-servers.org
    • $100k/yr to get own TLD
    • ? where does the money for those fees go for buying a TLD?

    • a text file and are hierarchical

    • SOA -> start of authority
    • Name of authoritative master name server (NS)

    • CNAME - redirect at the DNS record

    • MX - find mail server at specific address
    • TXT - validate domain ownership/spam rules
    • SRV - describes a service type and port (like for an in-network printer)
    • PTR - reverse DNS

    • zone transfers

      • most IPs require more than 1 nameserver
      • master name server and that will sync with slave nameservers
      • no authentication and may expose internal services
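The TTL behaviour noted under caches (a record is reused until it expires, and you cannot flush someone else's cache) can be sketched as a small cache keyed by name. This is an illustration, not how any particular resolver is implemented; the class name and the injectable clock are choices made for the example.

```python
import time


class TtlCache:
    """Remember answers until their TTL (in seconds) runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._records = {}   # name -> (answer, expiry time)

    def put(self, name, answer, ttl):
        self._records[name] = (answer, self._clock() + ttl)

    def get(self, name):
        record = self._records.get(name)
        if record is None or self._clock() >= record[1]:
            self._records.pop(name, None)   # expired: caller must look it up again
            return None
        return record[0]


now = [0.0]                                  # fake clock so the example is deterministic
cache = TtlCache(clock=lambda: now[0])
cache.put("example.com", "192.0.2.1", ttl=3600)

now[0] = 3599.0
still_cached = cache.get("example.com")      # within the hour: served from cache
now[0] = 3600.0
after_expiry = cache.get("example.com")      # TTL is up: None, a fresh lookup is needed
```

This is also why the notes suggest keeping the old IP address alive for a while: caches you do not control keep handing it out until the TTL runs out.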

Security

  • old protocol
  • cache poisoning
  • DNSSEC - checks the origin - certificate chain
  • most modern browsers are checking for DNSSEC records

  • DDoS

    • lots of open resolvers out there
    • DNS amplification for DDoS
    • disable recursion

DNS in application architecture

  • DNS failover and load balancing
  • add multiple DNS records -> will be a poor man’s load balancer because it will return a random record
  • intelligent DNS server (CNS)
  • configuration in DNS
    • use DNS as a configuration store
    • DNS record could point to a TXT value
  • service discovery

DNS for fun and profit

  • Abusing DNS

    • public hotspots
    • proxy server translating HTTP
  • iodine - same HTTP over DNS - tunnel traffic

    • code.kryo.se/iodine