CS35L Compiled Notes

UCLA CS35L Final Study Sheet

    Week 1 Linux and Vi/Emacs

    - man(ual) pages
        documentation that comes preinstalled on Unix-based operating systems

    - wh... commands
        whatis: returns name section of man page
        whereis: locates binary, source, and manual page files for command

    - using find command
        find [starting directory] [options] [arguments]
        -type: type of file
        -name: name of file

    - symbolic links vs hard links
        touch command creates file
        hard link: ln target link_name
            points directly to the original file's data on disk (same inode)
        symbolic link: ln -s target link_name
            points to the target's path, not necessarily to the location of the original file on disk
            like a shortcut in Windows
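
The difference between the two kinds of link shows up once the original name is deleted. The sketch below uses Python's os module as a stand-in for the ln commands; the file names are made up for illustration, and it assumes a filesystem that supports both hard and symbolic links (as a typical Linux one does).

```python
import os
import tempfile

def demo_links(directory):
    """Create a file, a hard link and a symbolic link; report what each sees."""
    original = os.path.join(directory, "original.txt")
    hard = os.path.join(directory, "hard.txt")
    soft = os.path.join(directory, "soft.txt")

    with open(original, "w") as f:
        f.write("hello\n")

    os.link(original, hard)      # like: ln original.txt hard.txt
    os.symlink(original, soft)   # like: ln -s original.txt soft.txt

    # A hard link is another name for the same inode.
    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    # A symbolic link only stores the path of its target.
    stores_path = os.readlink(soft) == original

    os.remove(original)          # delete the original name

    with open(hard) as f:
        hard_still_readable = f.read() == "hello\n"
    symlink_dangling = not os.path.exists(soft)   # exists() follows the link

    return same_inode, stores_path, hard_still_readable, symlink_dangling

with tempfile.TemporaryDirectory() as d:
    print(demo_links(d))
```

On such a filesystem all four values should come back True: the hard link keeps the data alive, while the symbolic link is left pointing at a name that no longer exists.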

    - other commands
        ps: list running processes
        kill: terminate process by PID (kill PID)
        cron: schedule periodic tasks
        diff: outputs differences between two files
        wget: retrieve content from web server

    Week 2 Shell Scripting

    - Locale
        set of parameters that define a user's cultural preferences
        language, country, region-specific settings

    - Environment Variables
        variables that can be accessed from any process
        HOME: path to user's home directory
        PATH: list of directories to search in for command to execute
        can be changed with export VARIABLE=[value] (no spaces around =)

    - locale command
        gets data from LC_* environment variables
        LC_TIME: date and time formats
        LC_NUMERIC: non-monetary number formats

    - locale settings can affect program behavior
        e.g. sort order

    - The C Locale
        default behavior
        behavior of Unix systems before locales

    - sort, comm, tr
        sort: sort lines of text files
            sort [OPTION] [FILE]
            sort order depends on locale
            C locale: ASCII sorting
        comm: compare two sorted files line by line
            comm [OPTION] FILE1 FILE2
            comparison depends on locale
        tr: translate or delete characters
            tr [OPTION] SET1 [SET2]
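
To see what "ASCII sorting" in the C locale means in practice, here is a small sketch. It relies on the fact that Python compares strings by code point, which for plain ASCII text gives the same order as byte-wise sorting. The word list is made up, and the case-insensitive variant is only a rough stand-in for a language-specific locale, not a real collation.

```python
words = ["banana", "Apple", "cherry", "apple", "Banana"]

# Code-point order: every uppercase ASCII letter sorts before every
# lowercase one, as with sort in the C locale.
c_locale_order = sorted(words)

# Case-insensitive key: an approximation of how many non-C locales
# interleave upper and lower case (assumption, not a real collation).
folded_order = sorted(words, key=str.lower)

print(c_locale_order)  # ['Apple', 'Banana', 'apple', 'banana', 'cherry']
print(folded_order)    # ['Apple', 'apple', 'banana', 'Banana', 'cherry']
```

Because comm expects its inputs sorted in the order of the current locale, sorting under one locale and comparing under another can produce misleading results.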

    - shell and OS
        shell: user interface to OS
        accepts commands as text
        common shells: sh, bash, csh, ksh

    - Compiled vs Interpreted
        compiled: translates source code to machine language to be executed by hardware
            efficient and fast
            work at low level
            require recompiling
            e.g. C/C++
        interpreted: carries out commands immediately
            much slower execution
            high-level, easier to learn
            portable
            e.g. PHP, Ruby, bash

    - Scripts
        shell script file: file w/ shell commands
        when a shell script is executed, a new child shell process is spawned to run it
        first line of script states which child shell to use, by prepending #! to the path of the shell (e.g. #!/bin/sh)
        # starts a comment line

    - Execution Tracing
        shell prints out each command as it is executed
        set -x: turn tracing on; set +x: turn it off

    - Variables
        declared using = (no spaces around =)
        referenced using $ (e.g. $var)
        Built-In Shell Variables
            $#: number of arguments given to current process
            $?: exit status of previous command
            $HOME: home directory
            $[number]: argument #[number] (e.g. $1 = first arg)
            if [number] > 9, put it in braces (e.g. ${10})

    - if Statements
        put condition inside [ ] with a space before and after it
        fi to close if statement
        -gt: greater than (e.g. [ $COUNT -gt 0 ])

    - Exit Return Values
        0: command exited successfully
        >0: failure
        1-125: command exited unsuccessfully
        126: command found, but file not executable
        127: command not found
        >128: command died from receiving a signal
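
The exit values above can be observed from any parent process, not just from $? in the shell. The following sketch runs a few command lines through sh and reads back their status; the helper name exit_status and the nonexistent command name are made up for illustration, and it assumes sh is on the PATH.

```python
import subprocess

def exit_status(shell_command):
    """Run a command line through sh and return its exit status."""
    result = subprocess.run(
        ["sh", "-c", shell_command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode

print(exit_status("true"))                    # 0: success
print(exit_status("exit 3"))                  # 3: unsuccessful exit
print(exit_status("no_such_command_cs35l"))   # 127: command not found
```

One difference worth knowing: when a child dies from a signal, the shell reports 128 plus the signal number, whereas Python's subprocess reports the negated signal number instead.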

    - Quotes
        backticks (` `): contents executed as a command
            temp=`ls`; echo $temp
        single quotes (' '): literal string
        double quotes (" "): like single quotes but expand `, $, and \

    - Loops
        commands enclosed in do ... done

        while [ $COUNT -gt 0 ]
        do
            echo Value of count is: $COUNT
            let COUNT=COUNT-1
        done

        temp=`ls`
        for f in $temp
        do
            ~~~
        done

        f refers to each word in temp=`ls` output

    - Output
        echo: writes arguments to stdout, can't output escape characters (w/o -e)
        printf: output data with complex formatting
            printf [format] [thing to format]
            printf "%.3e\n" 46553132.14562253
            e = scientific notation; .3 = 3 decimal places
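
The conversion specifiers used by printf follow the C conventions, and Python's % operator accepts the same ones, so the example above can be checked directly. This is a small illustration only; the two extra formats are additions that do not appear in the notes.

```python
value = 46553132.14562253

# %.3e: scientific notation with 3 digits after the decimal point
print("%.3e" % value)   # 4.655e+07

# %.2f: fixed-point notation with 2 digits after the decimal point
print("%.2f" % value)   # 46553132.15

# %10d: integer right-aligned in a field 10 characters wide
print("%10d|" % 42)     # prints 8 spaces, then 42|
```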

    - File descriptors
        every program has 3 standard file descriptors to interact with
        reads input from stdin (0)
        writes normal output to stdout (1)
        writes error output to stderr (2)

    - Redirection and Pipelines
        program < file: makes program's stdin come from file
        program > file: makes program's stdout be written to file
        program 2> file: makes program's stderr be written to file
        program >> file: appends program's stdout to file
        program1 | program2: assigns stdout of program1 as the stdin of program2
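
What the shell does for program1 | program2 can be sketched with Python's subprocess module: the first program's stdout is connected to a pipe, and the second program's stdin reads from that pipe. The sketch assumes the printf and sort commands are installed (as on a typical Linux system), and the sample words are made up.

```python
import os
import subprocess

# Equivalent in spirit to:  printf 'pear\napple\nfig\n' | sort
producer = subprocess.Popen(
    ["printf", r"pear\napple\nfig\n"],
    stdout=subprocess.PIPE,              # producer's stdout (fd 1) goes into a pipe
)
consumer = subprocess.run(
    ["sort"],
    stdin=producer.stdout,               # consumer's stdin (fd 0) reads from that pipe
    stdout=subprocess.PIPE,
    env=dict(os.environ, LC_ALL="C"),    # C locale so the sort order is predictable
)
producer.stdout.close()
producer.wait()

pipeline_output = consumer.stdout.decode()
print(pipeline_output, end="")
```

Passing an open file object instead of subprocess.PIPE as stdin or stdout would correspond to the < and > redirections.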

    - Regular Expressions
        notation that lets you search for text matching a criterion
        \: escape next character
        .: match any single character except NUL
        *: match any number (including zero) of the single character that immediately precedes it
        ^: match following expression only at beginning of line
        $: match preceding expression only at end of line
        Examples (grep ...)
            tolstoy: seven letters tolstoy, anywhere on line
            ^tolstoy: only if at beginning of line
            tolstoy$: only if at end of line
            ^tolstoy$: line contains only tolstoy
            [Tt]olstoy: Tolstoy or tolstoy
            tol.toy: tol[char]toy
            tol.*toy: tol[char(s)]toy
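
The example patterns can be tried out without grep. The sketch below uses Python's re module, whose syntax agrees with grep for these particular patterns; the sample lines and the helper name matching are made up for illustration.

```python
import re

lines = ["tolstoy", "Tolstoy wrote novels", "leo tolstoy", "toltoy", "tolXYZtoy"]

def matching(pattern):
    """Return the lines that contain a match, like grep PATTERN."""
    return [line for line in lines if re.search(pattern, line)]

print(matching(r"tolstoy"))     # ['tolstoy', 'leo tolstoy']
print(matching(r"^tolstoy"))    # ['tolstoy']
print(matching(r"tolstoy$"))    # ['tolstoy', 'leo tolstoy']
print(matching(r"^tolstoy$"))   # ['tolstoy']
print(matching(r"[Tt]olstoy"))  # ['tolstoy', 'Tolstoy wrote novels', 'leo tolstoy']
print(matching(r"tol.toy"))     # ['tolstoy', 'leo tolstoy']
print(matching(r"tol.*toy"))    # ['tolstoy', 'leo tolstoy', 'toltoy', 'tolXYZtoy']
```

Note how tol.toy rejects toltoy (the dot needs exactly one character) while tol.*toy accepts it (the star allows zero).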

    - grep: returns lines containing match to regex

    - sed: replace parts of text
        sed 's/regExpr/replText/' [FILE]

    - Text Processing Commands
        wc: outputs a one-line report of lines, words, and bytes
        head: extracts top of files
        tail: extracts bottom of files

    - Regular Expressions
        Quantification: how many times the previous expression may repeat
        Grouping
        Alternation
        Anchors

    - Inodes
        data structure that stores info about files
        identified by unique inode number
        if hard-linked, same inode number
        check using ls -i [filename]

    Week 3 Modifying Programs

    - Compilation Process
        .cpp source file
          -> (C++ preprocessor + header files) -> expanded source code
          -> (compiler) -> .s assembler file
          -> (assembler) -> .o object code file
          -> (linker + library function object code) -> executable file

    - Makefiles
        make compiles files and makes sure they're up-to-date
        only (re)compiles when it needs to
        [target] : [prerequisite]
            [command]
        all : [prerequisite]      # recompile everything in makefile
        clean :                   # no prereqs
            rm -rf *.o a.out      # remove all object files and executables
        check :                   # cmp and diff

    - Build process
        1 configure
            generates Makefile
            checks package dependencies
        2 make
            compiles code and creates executables
        3 make install
            copies executables to system directories

    - diff files
        --- defines original file that was changed
        +++ defines modified file
        @@ -[start],[count] +[start],[count] @@
            each @@ section is called a hunk
        - line removed
        + line added
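
A diff in this format can be generated with Python's difflib module, which makes the pieces easy to inspect. This is an illustrative sketch: the file names and contents are made up, and difflib is a stand-in for the diff -u command rather than the tool used in the course.

```python
import difflib

original = ["one", "two", "three"]
modified = ["one", "2", "three"]

diff_lines = list(difflib.unified_diff(
    original, modified,
    fromfile="original.txt", tofile="modified.txt",
    lineterm="",
))
print("\n".join(diff_lines))
# --- original.txt
# +++ modified.txt
# @@ -1,3 +1,3 @@
#  one
# -two
# +2
#  three
```

The hunk header says the hunk covers 3 lines starting at line 1 in the original and 3 lines starting at line 1 in the modified file; lines with a leading space are unchanged context.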

    - patching
        patch -pNum < patch.txt
            Num: number of leading path components to strip from file names in the patch

    - Python scripting
        first line: #! /bin/python defines interpreter
        import optparse
            library used to parse command-line options
            arguments, options, and option arguments
        lists
            dynamic (size not fixed) and heterogeneous (objects of different types)
            list = [element1, element2, ...]
        for loops
            for i in list:
                print(i)
            for i in range(len(list)):
                print(list[i])
        check # of arguments: len(sys.argv)
            sys.argv[0]: name of program
            in bash, $# does not include the name of the script as an arg
        functions
            def [name](args):
        print to standard error: sys.stderr.write(msg)
        classes
            class [name]:
                def [method](self, args):
                def __init__(self, args):    # constructor; not required; runs when an instance of the class is created
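
The distinction between arguments, options, and option arguments is easiest to see in a small parser. The sketch below uses optparse as named in the notes; the option -n/--numlines, the function name parse_command_line, and the file name are hypothetical choices made for illustration.

```python
from optparse import OptionParser

def parse_command_line(argv):
    """Split argv (without the program name) into an option value and an argument."""
    parser = OptionParser(usage="usage: %prog [-n COUNT] FILE")
    # -n is an option; the value after it is its option argument (converted to int)
    parser.add_option("-n", "--numlines", action="store", dest="numlines",
                      type="int", default=1, help="number of lines to print")
    options, args = parser.parse_args(argv)
    if len(args) != 1:
        # parser.error prints the usage message to stderr and exits with status 2
        parser.error("expected exactly one FILE argument")
    return options.numlines, args[0]

# In a real script this would be parse_command_line(sys.argv[1:]),
# because sys.argv[0] holds the program name.
print(parse_command_line(["-n", "3", "notes.txt"]))   # (3, 'notes.txt')
print(parse_command_line(["notes.txt"]))              # (1, 'notes.txt')
```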

    Week 4 Change Management

    - Centralized vs. Distributed VCS
        Centralized
            single central copy of the project history on a server
            changes are uploaded to server
            other programmers can get changes from the server
            e.g. SVN, CVS
            pros: everyone can see changes at same time, simple to design
            cons: only stored in one central place (single point of failure)
        Distributed
            each developer gets the full history of a project on their own drive
            developers can communicate changes without going through central server
            e.g. git, mercurial, bazaar, bitkeeper
            pros: commit or revert changes while offline, faster because accessing local hard drive, share changes with selected people
            cons: long time to download, uses a lot of space

    - Terminology
        repository: a data structure, usually stored on a server, that contains a set of files and directories as well as the full history and different versions of a project
        working copy: a local copy from a repository at a specific time or revision
        check-out: create local working copy from repository
        commit: write changes from working copy to repository

    - Git Architecture
        blob: file contents (sequence of bytes)
            stored in .git/objects
            indexed by unique hash
        tree: filesystem directory
            can include other git trees or blobs
        commit: created when git commit is called
            points to top-level tree of project at point of commit
            contains name of committer, time of commit, and hash of current tree
        tags: names given to commit objects
        head: a reference to a commit object
        HEAD: refers exclusively to the currently active (checked out) head
        branch: refers to a head and the entire history of ancestor commits preceding that head
        master: default head name that refers to the default branch created in a repository

    - git Commands
        git init           // creates new repository
        git clone          // gets a copy of an existing repo
        git add            // adds files to index
        git commit         // changes are added to repo
        git help, git status, git diff, git log
        git checkout       // checks out specific version/branch of tree
        git revert         // reverts commit but does not delete commit object
        git diff           // shows changes made compared to index
        git status         // shows list of modified files
        git branch         // lists branches
        git checkout -b    // creates a new branch and checks it out

    Week 5 C Debugging/Programming

    - Debugging Process
        reproduce the bug
        simplify program input
        use debugger to track down problem
        fix problem

    - Debugger
        program used to run and debug other programs

    - GDB
        GNU Debugger
        can debug C, C++, Java, Objective-C
        compile using -g flag
        gcc [flags] -g [source files] -o [output file]

    - Set Breakpoints
        break file1.c:6
            program will pause when it reaches line 6 of file1.c
        break [position]
        break [position] if expression
        info breakpoints (info b): show list of breakpoints
        delete [breakpoint #]: remove breakpoint
        disable [breakpoint #]: if no arguments, all breakpoints affected
        ignore [breakpoint #] iterations: ignore breakpoint for the given number of iterations
        When program reaches breakpoint
            c(ontinue): continue until next breakpoint
            n(ext): execute next line, treating a function call as a single instruction
            s(tep): like next, but called functions are executed line by line
            fin(ish): resume until current function returns; also shows return value and next statement

    - Displaying Data
        print [/format] [expression]
        formats
            d: decimal
            x: hexadecimal
            o: octal
            t: binary

    - Watchpoints
        watch [variable]
        watchpoints pause program when the watched variable changes

    - Analyzing the Stack in GDB
        bt/backtrace: shows call trace/stack
            without function calls: #0 main() at program.c:10
            one frame on the stack, numbered 0, belonging to main
        (gdb) info frame
            displays information about current stack frame, including its return address and saved register values
        (gdb) info locals
            lists the local variables of the function corresponding to the stack frame, with their current values
        (gdb) info args
            lists the argument values of the corresponding function call

    - Basic Data Types
        int: holds integer, typically in 4 bytes
        float: holds floating point number in 4 bytes
        double: holds double-precision floating point number in 8 bytes
        char: holds a byte of data (characters)
        void
        NO BOOL

    - Pointers
        variables that store memory addresses
        & takes the address of a variable; * dereferences a pointer
        double x, *ptr;    // declare double x and pointer to double ptr
        ptr = &x;          // store address of x into ptr
        *ptr = 7.8;        // set value of x to 7.8
        function pointers (functors)
            double (*func_ptr)(double, double);
            func_ptr = pow;    // func_ptr points to pow()

    - structs
        no classes in C
        used to package related data of different types together
        struct Student {
            /* members */
        };
        struct Student s;

        typedef struct {
            /* members */
        } Student;
        Student s;

        cannot have member functions (so no constructors)
        no access specifiers (public and private)

    - Dynamic memory
        memory allocated at runtime
        allocated on heap
        void *malloc(size_t size);
            allocates size bytes and returns a pointer to the allocated memory
        void *realloc(void *ptr, size_t size);
            changes the size of the memory block pointed to by ptr to size bytes
        void free(void *ptr);
            frees the block of memory pointed to by ptr

    - Reading/Writing Characters
        int getchar();
            returns next char from stdin
        int putchar(int character);
            writes a character to the current position in stdout

    - Formatted I/O
        int fprintf(FILE *fp, const char *format, ...);
        int fscanf(FILE *fp, const char *format, ...);
        fp can be either a file pointer or stdin/stdout/stderr

    - Compiling a C program
        gcc [OPTIONS] [C file] -o [OUTPUT NAME]
            compiles binary named [OUTPUT NAME]
        -g: include symbol and source-line info for debugging

    Week 6 SSH Encryption & Signatures

    - SSH
        secure shell
        used to remotely access a shell
        successor to telnet, with an encrypted and better authenticated session

    - Encryption types
        Symmetric Key Encryption
            shared/secret key
            key used to encrypt is same as key used to decrypt
        Asymmetric Key Encryption: Public/Private
            2 different (but related) keys: public and private
            only creator knows relation
            private key cannot be derived from public key
            data encrypted by public key can only be decrypted by private key and vice versa
            public key can be seen by anyone but private key should never be published

    - SSH Protocol
        login by ssh username@somehost
        if first time talking to server -> host validation
            ssh doesn't know about host yet
            shows hostname, IP address and fingerprint of server's public key
            after accepting, server's public key is saved to ~/.ssh/known_hosts
        host validation
            next time client connects to server, check host's public key against saved public key; warning if they don't match
            validates using asymmetric encryption
            encrypt message with public key
            if server is true owner, it can decrypt message with private key
        session encryption
            client and server agree on symmetric encryption key (session key)
            messages sent between client and server are encrypted and decrypted using the same session key
        client authentication
            password-based authentication
                prompt for password on remote server
                if username exists and password is correct then access is allowed
            key-based authentication
                generate key pair on client
                copy public key to server to ~/.ssh/authorized_keys
                server authenticates client if the client can decrypt a message encrypted with the client's public key
                private key can also be protected with a passphrase, which will be prompted for every time you ssh to a host

    - ssh-agent (passphrase-less ssh)
        program used with OpenSSH that provides a secure way of storing the private key
        ssh-add prompts user for passphrase once and adds the key to the list maintained by ssh-agent
        once passphrase is added to ssh-agent, the user will not be prompted for it again when using SSH
        OpenSSH will talk to local ssh-agent daemon and retrieve private key from it automatically

    Digital Signatures

    - Secret key (symmetric) cryptography
        single key used to both encrypt and decrypt a message

    - Public key (asymmetric) cryptography
        two keys used: public and private
        if a message is encrypted with one key, it has to be decrypted with the other

    - Digital Signature
        an electronic stamp or seal that is appended to a document
        ensures data integrity (document not changed during transmission)
        creator signs with private key, user verifies with public key
        detached signatures
            stored and transmitted separately from the message they sign
            commonly used to validate software distributed in compressed tar files
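
The sign-with-private, verify-with-public idea can be sketched with textbook RSA on deliberately tiny numbers. Everything here is a toy assumption chosen for illustration: the primes 61 and 53, the exponents, the function names, and the reduction of the hash modulo N. It offers no real security and is not how GPG or SSH implement signatures.

```python
import hashlib

# Toy RSA key pair built from tiny textbook primes (61 and 53).
# Real keys are thousands of bits long; these numbers are for illustration only.
N = 61 * 53          # modulus, part of both keys (3233)
PUBLIC_E = 17        # public exponent: anyone may know it
PRIVATE_D = 2753     # private exponent: 17 * 2753 = 1 (mod 60 * 52), kept secret

def digest(message):
    """Hash the message and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message):
    """Creator: transform the digest with the private key."""
    return pow(digest(message), PRIVATE_D, N)

def verify(message, signature):
    """User: undo the transform with the public key and compare digests."""
    return pow(signature, PUBLIC_E, N) == digest(message)

document = b"CS35L assignment"
signature = sign(document)     # detached: kept separate from the document itself

print(verify(document, signature))              # True
print(verify(document, (signature + 1) % N))    # False: the signature was altered
```

Only the holder of PRIVATE_D can produce a value that the public exponent maps back to the document's digest, which is what lets a user check both origin and integrity.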

    Week 7 System Calls

    - Processor Modes
        operating modes that place restrictions on the type of operations that can be performed by running processes
        user mode: restricted access to system resources
        kernel mode: unrestricted access
        hardware contains a mode bit (0 for kernel mode, 1 for user mode)

    - User Mode vs Kernel Mode
        User mode
            CPU restricted to unprivileged instructions and a specified area of memory
            for untrusted user code
        Kernel mode
            CPU is unrestricted, can use all instructions and access all areas of memory
            for trusted kernel/OS code
        purpose is for OS to maintain protection from malicious code and fairness of resource use
        some instructions are privileged and can only be executed by trusted code, to ensure
            I/O protection: prevent users from performing illegal I/O ops
            memory protection: prevent users from accessing illegal memory and modifying kernel code and data structures
            prevent users from hogging the CPU

    - Trusted Code
        kernel is core of operating system software that executes in supervisor mode
        kernel code is trusted
            implemented with protection mechanisms
            cannot be changed using untrusted software in user space
        kernel executes privileged operations on behalf of untrusted user processes, serving as the interface between hardware and software

    - System Calls
        special type of function that:
            is part of the kernel of the operating system
            changes the CPU's mode from user mode to kernel mode to enable more capabilities
            is used by user-level processes to invoke OS functions
            verifies that the user should be allowed to do requested action and then does the action (kernel performs the actual operation)
        only way for user program to perform privileged operations
        control transferred from untrusted user process to trusted OS
            process is interrupted and control is passed to kernel
        expensive and can hurt performance
            system has to interrupt process and save its state, pass control to OS, OS performs action, OS saves its state, return control to user process

    - Library Functions
        functions that are part of the standard C library
        library equivalents have less overhead than directly using system calls
            getchar and putchar vs. read and write
            fopen and fclose vs. open and close
        library functions still have to make system calls
            but usually make fewer system calls than direct use of the system functions
            fewer mode switches -> less overhead

    - Unbuffered vs. Buffered I/O
        unbuffered
            every byte is read/written by kernel through a system call
        buffered
            collect as many bytes as possible into a buffer
            read more than a single byte into buffer at a time and use one system call for a block of bytes
            decreases number of system calls
        fwrite() copies outgoing data into local buffer as long as it isn't full and returns to caller immediately; when buffer space runs out, fwrite() calls write() to flush buffer to make room

    - time and strace
        time [options] command [arguments]
            real: time as read from wall clock
            user: CPU time used by process
            sys: CPU time used by system on behalf of your process
        strace: prints out system calls to stderr
            strace ./test
            strace -o output_file ./test
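
To make the buffered versus unbuffered difference concrete (the kind of difference strace makes visible), here is a sketch of a user-space buffer in front of the write system call. The class CountingWriter, the 4096-byte buffer size, and the 10000-byte workload are all assumptions chosen for illustration; it models the idea behind fwrite, not the actual C library implementation.

```python
import os
import tempfile

class CountingWriter:
    """Write bytes to a file descriptor, counting calls to os.write."""

    def __init__(self, fd, buffer_size):
        self.fd = fd
        self.buffer_size = buffer_size   # 0 means unbuffered
        self.buffer = bytearray()
        self.write_calls = 0             # each os.write is one write() system call

    def _syscall_write(self, data):
        os.write(self.fd, data)
        self.write_calls += 1

    def write(self, data):
        if self.buffer_size == 0:
            self._syscall_write(data)    # unbuffered: straight to the kernel
            return
        self.buffer += data              # buffered: just copy into the local buffer
        while len(self.buffer) >= self.buffer_size:
            self._syscall_write(bytes(self.buffer[:self.buffer_size]))
            del self.buffer[:self.buffer_size]

    def flush(self):
        if self.buffer:
            self._syscall_write(bytes(self.buffer))
            self.buffer.clear()

def count_write_calls(buffer_size, total_bytes=10000):
    with tempfile.TemporaryFile() as f:
        writer = CountingWriter(f.fileno(), buffer_size)
        for _ in range(total_bytes):
            writer.write(b"x")           # one byte at a time, like putchar
        writer.flush()
        return writer.write_calls

print(count_write_calls(0))      # 10000
print(count_write_calls(4096))   # 3
```

With the buffer in place, 10000 one-byte writes turn into two full 4096-byte writes plus one final flush of the remaining 1808 bytes, so far fewer mode switches are needed.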

    Week 8 Buffer Overruns

    - Stack Frame
        logical block pushed when calling a function, popped when returning
        contains:
            parameters to function
            local variables
            data necessary to recover program state
        frame pointer
            points to fixed location within frame
            variables are referenced by offsets from the frame pointer

    - Calling a Function
        push arguments
        push return address
        save old fp on stack and copy stack pointer into frame pointer to create new fp
        decrement sp to reserve space for local variables and state information

    - Buffer Overflow
        a buffer is a contiguous block of memory that holds multiple instances of the same data type
        stuffing more data into a buffer than it can handle results in overflow
        can be taken advantage of to execute arbitrary code

    - Exploiting Buffer Overflow
        allows us to change return address of function
        change flow of execution and execute arbitrary code
        place code we are trying to execute into the buffer we are overflowing
        overwrite the return address so it points back into buffer
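
The mechanism can be simulated safely by modelling a stack frame as a flat array of bytes. This is a deliberately simplified sketch: the layout (an 8-byte buffer immediately followed by a 4-byte return address, with no saved frame pointer in between), the addresses, and the function names are all assumptions made for illustration, and nothing here touches real process memory.

```python
BUFFER_SIZE = 8
RETURN_ADDRESS_SIZE = 4

def new_frame(return_address):
    """Model a frame as raw bytes: [8-byte buffer][4-byte saved return address]."""
    frame = bytearray(BUFFER_SIZE + RETURN_ADDRESS_SIZE)
    frame[BUFFER_SIZE:] = return_address.to_bytes(RETURN_ADDRESS_SIZE, "little")
    return frame

def saved_return_address(frame):
    return int.from_bytes(frame[BUFFER_SIZE:], "little")

def unchecked_copy(frame, data):
    """Copy like strcpy: byte by byte from the start of the buffer, no length check."""
    for offset, byte in enumerate(data):
        frame[offset] = byte

frame = new_frame(0x08048000)
unchecked_copy(frame, b"AAAA")                      # fits: return address intact
print(hex(saved_return_address(frame)))             # 0x8048000

payload = b"A" * BUFFER_SIZE + (0xDEADBEEF).to_bytes(4, "little")
unchecked_copy(frame, payload)                      # 12 bytes into an 8-byte buffer
print(hex(saved_return_address(frame)))             # 0xdeadbeef
```

The four bytes that spill past the end of the buffer land exactly on the saved return address, which is how an attacker redirects execution when the function returns.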

    Week 9 Multithreaded Performance

    - Multiprocessing
        use of multiple CPUs/cores to run multiple tasks simultaneously

    - Parallelism
        executing several computations simultaneously to gain performance
        different forms of parallelism
            multitasking: several processes scheduled to run alternately or possibly simultaneously on a multiprocessing system
            multithreading: same job is broken logically into pieces which may be executed simultaneously on a multiprocessing system

    - threads
        flow of instructions
        smallest unit of processing scheduled by OS
        process consists of at least one thread
        multiple threads can be run on:
            uniprocessor: processor switches between different threads
            multiprocessor: multiple processors/cores run threads at the same time

    - Multitasking
        each process has its own address space
        processes communicate using system calls
        in a multithreaded program, the threads share the address space of their process: global data, code, and heap are shared, and each thread gets its own unique stack
        allows threads to easily access and share data
        thread creation and destruction less expensive than process creation and destruction
        but non-trivial (synchronization and race conditions)

    - Multithreading vs Multitasking
        1 Multithreading
            threads share same address space
            lightweight creation/destruction
            easy inter-thread communication
            error in one thread can bring down all threads in process
        2 Multitasking
            processes are insulated from each other
            expensive creation/destruction
            expensive interprocess communication
            error in one process will not bring down another process

    POSIX Threads

    #include <pthread.h>

    - basic pthread functions
        pthread_create: creates new thread within process
            int pthread_create(pthread_t *tid, const pthread_attr_t *attr,
                               void *(*my_function)(void *), void *arg);
            return values: 0 on success, nonzero error number on failure
            tid: unique id for thread
            attr: NULL for default
            my_function: function that thread executes once created
            arg: single argument to my_function (use struct for more than one arg)
        pthread_join: waits for thread to terminate before continuing
            int pthread_join(pthread_t tid, void **status);
            tid: thread id of thread to wait on
            status: NULL if no status needed
        pthread_equal: compares thread ids to see if they refer to same thread
        pthread_self: returns the id of the calling thread
        pthread_exit: terminates the currently running thread
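
The create-then-join pattern looks much the same in any threading API. The sketch below uses Python's threading module as an analogue of the pthread calls above, not the pthread API itself; the worker function and the shared results list are made up for illustration.

```python
import threading

results = [None] * 4          # shared data: every thread sees the same list

def worker(index, base):
    """Thread body; writes into the shared list instead of returning a value."""
    results[index] = base * base

threads = []
for i in range(4):
    # Roughly analogous to pthread_create(&tid, NULL, my_function, arg)
    t = threading.Thread(target=worker, args=(i, i + 1))
    t.start()
    threads.append(t)

for t in threads:
    t.join()                  # roughly analogous to pthread_join(tid, NULL)

print(results)                # [1, 4, 9, 16]
```

Each thread writes to its own slot of the shared list, so no synchronization is needed here; if several threads updated the same slot, a lock would be required to avoid a race condition.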