Check these pages for language-specific notes:

Other languages


HTML header

<html lang="en">
  • Here lang is an 'attribute', with value en.
  • html is the tag.
  • head has metadata, one of which is title (to fill the tab bar).
  • HTML5 adds semantic tags such as 'article', 'header', and 'figure'.


  • Use classes to group your content. Refer to a class with a leading '.' in CSS, e.g.
.site-nav-header { width: 300px }
  • IDs, on the other hand, can only be used once per HTML page. Refer to an ID with a leading '#' in CSS, e.g.
#main-title { color: green }


<label for="nickname">Please enter your nickname</label>
<input type="text" id="nickname" name="nickname">
  • The label's for should match the input's id
  • The name is what is passed to the backend as a variable name
  • input type can be a lot of things, like 'email' or 'submit'

  • Labels are inline elements and inputs are inline-block elements, so by default they are laid out horizontally on the same line.

  • To align them better, wrap them in divs, which are block-level container tags that break up these horizontal runs into vertical stacks.

There are 3 groups of elements in the way the browser stacks them:

  • inline: span, em, strong (all laid out horizontally)
  • block level: p, div, article (browser starts each one on a new line)
  • inline-block level: input, textarea (laid out horizontally, but can be given a width and height)

Tcl: XML parsing example

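package require tdom
# $XML is assumed to already hold the XML document as a string
set dom [dom parse $XML]
set recording [$dom documentElement]    ;# root element
set datamode [$recording firstChild]    ;# first child of the root
set session [$datamode nextSibling]     ;# node following datamode
$session attributes *                   ;# list all attribute names
$session getAttribute session_id        ;# value of a single attribute
set participant [$session nextSibling]  ;# node following session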

Other study notes


  • Use SO_REUSEADDR when stopping/starting servers: after a socket is closed, the OS keeps it in the TIME_WAIT state for up to ~4 minutes in case it has to retransmit FINs/ACKs, and without this option a restarted server cannot rebind the port during that time.
  • Deadlocks occur if the OS buffers fill up at both ends. E.g. the client sends blocks of data; the server does a recv(1024), processes it, and sends data back (say, 1024 bytes), but the client only does a recv of, say, 10. The client's receive buffer then fills up because it is getting far more data than it reads, so the server's sends stop working. Similarly the server's recv fills up …
  • Either side can call socket.shutdown, e.g. a client that has finished sending data and wants to signal this. The connection stays open, so the client can continue receiving data. shutdown's flags (SHUT_RD, SHUT_WR, SHUT_RDWR) stop reads, writes, or both.
  • Address families: pretty much always AF_INET (IPv4). AF_INET6 is for IPv6, AF_UNIX is for local file sockets, and others (Bluetooth etc.) also exist.
  • Socket type: SOCK_DGRAM (2) and SOCK_STREAM (1). (Each address family has its own UDP/TCP equivalents under the dgram/stream types.)
  • The last field is the protocol, which can be left as zero. IPPROTO_TCP is 6 and IPPROTO_UDP is 17, but the protocol can be inferred from the socket type above, so we don't need to set it each time.
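
A minimal sketch of the SO_REUSEADDR and shutdown points above, using Python's socket module. The helper names (run_echo_server, demo), the loopback address, and the upper-casing reply are made up for illustration; this is not a production server.

```python
import socket
import threading


def run_echo_server(server_sock):
    """Accept one client, read until it stops sending, then reply once."""
    conn, _addr = server_sock.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:  # b"" means the peer shut down its write side (or closed)
                break
            chunks.append(data)
        conn.sendall(b"".join(chunks).upper())


def demo():
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow rebinding the port right after a restart, despite TIME_WAIT sockets.
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    worker = threading.Thread(target=run_echo_server, args=(server,))
    worker.start()

    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(b"hello")
        client.shutdown(socket.SHUT_WR)  # done sending; receiving still works
        reply = b""
        while True:
            data = client.recv(10)  # small reads are fine as long as we keep looping
            if not data:
                break
            reply += data

    worker.join()
    server.close()
    return reply
```

Calling demo() should return the upper-cased payload. The server only sees end-of-input because the client called shutdown(SHUT_WR); without it, both sides would sit in recv waiting for each other.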

  • To avoid v4/v6 and other binding confusions, use getaddrinfo(host, port): it returns a list of 5-tuples, FTPCA (family, type, protocol, canonical name, and address).

  • Use socket.getservbyport(53) to see port->service mappings, and socket.getservbyname('domain') for the reverse (service->port).

  • Similarly socket.gethostbyname('abc.com') or gethostbyaddr('')
  • So the machine's own IP address is socket.gethostbyname(socket.getfqdn()) (depending on the hosts file this may just be a loopback address).
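
A small sketch of the getaddrinfo advice above. The helper names resolve and make_socket are hypothetical; the example passes AI_NUMERICHOST so that no DNS lookup is needed, which is an assumption made only to keep it self-contained.

```python
import socket


def resolve(host, port, flags=0):
    """Return getaddrinfo's (family, type, proto, canonname, sockaddr) tuples for TCP."""
    return socket.getaddrinfo(host, port, type=socket.SOCK_STREAM, flags=flags)


def make_socket(info):
    """Create an unconnected socket matching one getaddrinfo result."""
    family, socktype, proto, _canonname, _sockaddr = info
    return socket.socket(family, socktype, proto)


# Numeric host: resolved locally, without asking a DNS server.
infos = resolve("127.0.0.1", 80, flags=socket.AI_NUMERICHOST)
family, socktype, proto, canonname, sockaddr = infos[0]
sock = make_socket(infos[0])
sock.close()
```

Because the family comes from the lookup result, the same code works unchanged when the host resolves to an IPv6 address.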

Unicode in DNS:

  • RFC 3490 specifies IDNA, which maps a Unicode hostname to an ASCII representation; the encoding it uses, Punycode, is specified in RFC 3492.
  • The lookup is performed for the encoded ASCII string only.
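
For illustration, Python ships an "idna" codec that performs this mapping. The hostname below is just a sample, not one from these notes.

```python
# Sample hostname with a non-ASCII character (u with umlaut).
name = "b\u00fccher.example"

# Each non-ASCII label is Punycode-encoded and prefixed with "xn--".
encoded = name.encode("idna")

# Decoding reverses the mapping.
decoded = encoded.decode("idna")

print(encoded)  # b'xn--bcher-kva.example'
```

The encoded form is what is actually sent to the resolver.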


  • UTF-8: 1-4 bytes per character. Use setlocale() to switch encodings.

example of unicode encoding:

  • character "¢" = code point U+00A2 = 00000000 10100010 → 11000010 10100010 → hexadecimal C2 A2
  • explanation: if the actual code is 00000000 10100010, then the first byte of the representation starts with 110 (to show that 2 bytes are needed to represent this character), followed by the top data bits. The continuation bytes always start with 10 and carry 6 data bits each.
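
The bit layout above can be sketched as a hand-rolled encoder. The function name utf8_encode is made up for this note, and the sketch does not reject surrogates or out-of-range values; in practice str.encode("utf-8") does this job.

```python
def utf8_encode(code_point):
    """Encode one Unicode code point as UTF-8 bytes (illustrative only)."""
    if code_point < 0x80:        # 1 byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:       # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([
            0b11000000 | (code_point >> 6),
            0b10000000 | (code_point & 0b111111),
        ])
    if code_point < 0x10000:     # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0b11100000 | (code_point >> 12),
            0b10000000 | ((code_point >> 6) & 0b111111),
            0b10000000 | (code_point & 0b111111),
        ])
    return bytes([               # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        0b11110000 | (code_point >> 18),
        0b10000000 | ((code_point >> 12) & 0b111111),
        0b10000000 | ((code_point >> 6) & 0b111111),
        0b10000000 | (code_point & 0b111111),
    ])


print(utf8_encode(0x00A2).hex())  # c2a2
```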

example 2:

  • The following string contains 4 utf-8 characters:


  • D4 in binary is 11010100, whose leading bits (110) tell us that this character takes 2 bytes. Similarly F0 == 11110000 (leading bits 11110), so the next character takes 4 bytes, and so on.
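
The lead-byte rule can be sketched as below. The function names are hypothetical, the sample bytes are not the string from this note, and the sketch assumes well-formed input (continuation bytes are not validated).

```python
def sequence_length(lead_byte):
    """Return how many bytes a UTF-8 character occupies, judged by its first byte."""
    if lead_byte >> 7 == 0b0:
        return 1
    if lead_byte >> 5 == 0b110:
        return 2
    if lead_byte >> 4 == 0b1110:
        return 3
    if lead_byte >> 3 == 0b11110:
        return 4
    raise ValueError("not a valid UTF-8 lead byte: %#x" % lead_byte)


def count_characters(data):
    """Count characters by hopping from lead byte to lead byte."""
    count = 0
    i = 0
    while i < len(data):
        i += sequence_length(data[i])
        count += 1
    return count


# Sample: one character each of 1, 2, 3, and 4 bytes (10 bytes in total).
sample = "a\u00e9\u20ac\U0001F600".encode("utf-8")
```

Here count_characters(sample) gives 4 even though len(sample) is 10.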

  • Do not modify string literals in C: doing so is undefined behavior, and the compiler may store identical string literals at the same address, so modifying one would affect the other.

Levenshtein Distance

  • Used in fuzzy searching (e.g. 'git lgo' which autocorrects and recommends 'log')
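  • Used to measure the difference between two strings: the minimum number of single-character insertions, deletions, and substitutions needed to turn one into the other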
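
A minimal sketch of the standard dynamic-programming computation, plus a toy "did you mean" helper. The names levenshtein and suggest and the command list are made up for illustration; this is not how git itself implements its suggestions.

```python
def levenshtein(a, b):
    """Edit distance between a and b using insertions, deletions, substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def suggest(word, candidates):
    """Return the candidate with the smallest edit distance to word."""
    return min(candidates, key=lambda candidate: levenshtein(word, candidate))


print(suggest("lgo", ["commit", "log", "pull", "push"]))  # log
```

Note that plain Levenshtein counts the swapped letters in 'lgo' as two substitutions, so the distance to 'log' is 2.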

Algorithmic Complexity

Notation     Type            Example
O(1)         Constant time   Dict lookup
O(log n)     Logarithmic     Binary search
O(n)         Linear          Iterating over a list
O(n log n)   Log-linear      Optimal sorting of arbitrary values
O(n^2)       Quadratic       Comparing *n* objects to each other
O(n^3)       Cubic           Floyd and Warshall's algorithms
O(n^k)       Polynomial      *k* nested loops over *n*
O(n!)        Factorial       Producing every ordering of *n* values
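
To make the logarithmic row concrete, here is a sketch of binary search that also counts its loop iterations. The function name binary_search_steps is made up for this note; the standard library's bisect module is the usual tool.

```python
def binary_search_steps(sorted_values, target):
    """Return (index, steps); index is -1 if target is absent."""
    low, high = 0, len(sorted_values) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if sorted_values[mid] == target:
            return mid, steps
        if sorted_values[mid] < target:
            low = mid + 1   # discard the lower half
        else:
            high = mid - 1  # discard the upper half
    return -1, steps


values = list(range(1024))
index, steps = binary_search_steps(values, 700)
```

With 1024 sorted items the loop never runs more than 11 times, whereas a linear scan may need up to 1024 comparisons.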