Structured programming 4 Day 34 LING 681.02 Computational Linguistics Harry Howard Tulane University


16-Nov-2009 LING 681.02, Prof. Howard, Tulane University


Course organization

http://www.tulane.edu/~ling/NLP/

Structured programming

NLPP §4


Today's topics

Defensive programming
Debugging
Algorithm design


Defensive programming

Brainstorm with pseudo-code
Careful naming conventions
Bottom-up construction
Functional decomposition
Comment, comment, comment
Regression testing


Brainstorm with pseudo-code

Before you write the first line of Python code, write what your program does as pseudocode.

That is to say, before writing a program that Python understands, write it in a way that people understand.


An example of pseudo-code

SPOT, move forward about 10 inches, turn left 90 degrees, and start moving forward, then start looking for a black object with your ultrasonic sensor, because I want you to stop when you find a black object, then turn right 90 degrees, and move backward 2 feet, OK?

What is good or bad about this example?


A different phrasing of the example

SPOT, move forward about 10 inches and stop.
Now turn left 90 degrees.
Start moving forward, and turn on your ultrasonic sensor.
Stop when you find a black object.
Turn right 90 degrees and stop.
Move backward 2 feet and stop.

What is good or bad about this example?


Pseudo and real code

The main advantage of the second phrasing is that we can match up the commands in each line to elements in the programming language.
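A minimal sketch of that matching, assuming a made-up Robot class. The class and its methods (move, turn, move_until) are hypothetical stand-ins, not a real robot API; they only record the commands they receive. Each line of the second phrasing becomes one call, and the pseudo-code survives as the comment.

```python
class Robot:
    """Hypothetical robot interface that only records the commands it is given."""

    def __init__(self):
        self.log = []

    def move(self, inches):
        # Positive values mean forward, negative values mean backward.
        self.log.append(("move", inches))

    def turn(self, degrees):
        # Negative values mean left, positive values mean right.
        self.log.append(("turn", degrees))

    def move_until(self, target):
        # Keep moving forward until the sensor reports the target.
        self.log.append(("move_until", target))


spot = Robot()
spot.move(10)                     # Move forward about 10 inches and stop.
spot.turn(-90)                    # Now turn left 90 degrees.
spot.move_until("black object")   # Start moving forward; stop at a black object.
spot.turn(90)                     # Turn right 90 degrees and stop.
spot.move(-24)                    # Move backward 2 feet (24 inches) and stop.

for command in spot.log:
    print(command)
```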


Careful naming conventions

Choose meaningful variable and function names.
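A small illustration, with names invented for this example: the two functions below do exactly the same thing, but only the second tells the reader what it does.

```python
# Hard to read: the names say nothing about the data or the purpose.
def f(x, y):
    return [w for w in x if len(w) > y]


# Easier to read: same logic, meaningful names.
def long_words(words, min_length):
    """Return the words that are longer than min_length characters."""
    return [word for word in words if len(word) > min_length]


sample = ["the", "computational", "cat", "linguistics"]
print(f(sample, 5))
print(long_words(sample, 5))
```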


Bottom-up construction

Instead of writing a 20-line program and then testing it, build and test smaller units, and then combine them.

In general, these smaller units should be functions.
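One possible sketch of bottom-up construction, using plain Python rather than NLTK to stay self-contained. The function names (tokenize, count_words, most_frequent) are chosen for this example only. Each small function can be tested on its own before the last one combines them.

```python
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def count_words(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)


def most_frequent(text, n):
    """Combine the two tested units: the n most frequent words in text."""
    return count_words(tokenize(text)).most_common(n)


# Test each unit separately, then the combination.
print(tokenize("The cat"))
print(count_words(["a", "a", "b"]))
print(most_frequent("the cat saw the dog", 1))
```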


NLP pipeline (Fig. 3.1)


Commenting

Add comments to every line, unless what a line does is so obvious that a comment would get in the way.

Your pseudo-code could become the comments on your real code.
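As a sketch of that idea, the comments below were written first as pseudo-code, and the Python was filled in underneath each one. The function average_word_length is an example invented here, not something from NLPP.

```python
def average_word_length(text):
    # Split the text into words.
    words = text.split()
    # If there are no words, report 0 rather than dividing by zero.
    if not words:
        return 0.0
    # Add up the length of every word.
    total = sum(len(word) for word in words)
    # Divide the total by the number of words.
    return total / len(words)


print(average_word_length("a bcd"))
```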


Regression testing

Keep a suite of test cases.
As your program gets bigger, it should still work on previous test cases.
If it stops working, it has 'regressed'.
A change in code has the (unintended) side effect of breaking something that used to work.
The doctest module supports this kind of testing.
It runs the examples in a docstring as if they had been typed in interactive mode and checks their output. See the doctest documentation.
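A minimal sketch of a doctest-based regression suite. The plural function and its toy rules are assumptions made for this example, and run_docstring_tests is a hypothetical helper written here; it is built from the standard doctest classes DocTestFinder and DocTestRunner.

```python
import doctest


def plural(word):
    """Return a naive English plural (toy rules, for illustration only).

    >>> plural('cat')
    'cats'
    >>> plural('fox')
    'foxes'
    >>> plural('fairy')
    'fairies'
    """
    if word.endswith('y'):
        return word[:-1] + 'ies'
    if word.endswith(('s', 'x', 'sh', 'ch')):
        return word + 'es'
    return word + 's'


def run_docstring_tests(func):
    """Run the examples in func's docstring and return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    for test in finder.find(func, name=func.__name__, globs={func.__name__: func}):
        runner.run(test)
    results = runner.summarize(verbose=False)
    return results.failed, results.attempted


failed, attempted = run_docstring_tests(plural)
print(failed, "failed out of", attempted)
```

In a script, the more common idiom is to call doctest.testmod() at the bottom of the file. Whenever a new rule is added to plural, re-running the old examples shows whether the code has regressed.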


Debugging topics

Check your assumptions
Exception > stack trace
Interactive debugging
Python's debugger
Prediction


Debugging

"Most code errors result from the programmer making incorrect assumptions" (NLPP:158).

When you find an error, first check your assumptions.

Add print statements to show values of variables and how far the program progresses.

Reduce input to smallest amount needed to cause the error.
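A sketch of both habits, using a count_tokens function invented for this example: an assert states the assumption about the input explicitly, and an optional debug flag prints variable values so you can see how far the program gets.

```python
def count_tokens(tokens, debug=False):
    # Assumption made explicit: tokens is a list of strings, not one long string.
    assert not isinstance(tokens, str), "tokens should be a list, not a string"
    counts = {}
    for i, token in enumerate(tokens):
        counts[token] = counts.get(token, 0) + 1
        if debug:
            # Show progress and the current values of the variables.
            print("step", i, "token", repr(token), "counts", counts)
    return counts


# A small input is enough to watch the function work.
print(count_tokens(["a", "b", "a"], debug=True))
```

Passing a plain string such as "a b a" trips the assert immediately, instead of silently counting characters.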


Stack trace

A runtime error (Python exception) gives a stack trace that pinpoints the location of program execution at the time of the error.

But the error may actually be upstream.
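A small illustration of an upstream error, with function names made up for this example. The exception is raised inside first_word, but the real bug is in load_words: list.sort() sorts in place and returns None.

```python
import traceback


def load_words():
    words = ["pear", "apple", "fig"]
    return words.sort()   # Bug: sort() returns None; sorted(words) was intended.


def first_word(words):
    return words[0]       # The exception is raised here, downstream of the bug.


try:
    first_word(load_words())
except TypeError:
    # Capture the stack trace as a string so it can be inspected and printed.
    trace = traceback.format_exc()
    print(trace)
```

The last frame of the trace names first_word, so reading it alone would send you to the wrong function.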


Python's debugger

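Invoke it (mymodule and myfunction stand for your own module and function):
import pdb
import mymodule
pdb.run('mymodule.myfunction()')

It lets you
monitor the execution of a program,
specify line numbers where the program should stop (breakpoints), and
step through sections of code, inspecting the values of variables.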


Prediction

Try to predict the effect of a potential bugfix before re-running the program.

"If the bug isn't fixed, don't fall into the trap of blindly changing the code in the hope that it will magically start working again." (NLPP:159)

For each change, try to articulate what is wrong and how the change will fix the problem.
Undo the change if it doesn't work.

"Programs don't magically work; they magically don't work." (Robert Goldman)

Algorithm design

NLPP §4.7


Algorithms

Divide and conquer
Start with something that works
Iteration
Recursion


Divide and conquer

Divide a problem of size n into two problems of size n/2.

Binary search - dictionary example.
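A minimal sketch of binary search over an alphabetically sorted word list, in the spirit of looking a word up in a dictionary. The function name and the sample words are chosen for this example. Each pass through the loop halves the part of the list that still has to be searched.

```python
def binary_search(sorted_words, target):
    """Return the index of target in sorted_words, or -1 if it is absent."""
    low, high = 0, len(sorted_words) - 1
    while low <= high:
        mid = (low + high) // 2           # Open the "dictionary" in the middle.
        if sorted_words[mid] == target:
            return mid
        if sorted_words[mid] < target:
            low = mid + 1                 # Target is in the second half.
        else:
            high = mid - 1                # Target is in the first half.
    return -1


words = ["apple", "banana", "cherry", "date", "fig"]
print(binary_search(words, "date"))
print(binary_search(words, "zebra"))
```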


Start with known

Transform the task into something that already works.
To find duplicates in a list, first sort the list, then check adjacent pairs for identity.
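The duplicate-finding recipe above can be sketched as follows. The function name find_duplicates is an assumption of this example; sorting is the step that "already works", and it guarantees that equal items end up next to each other.

```python
def find_duplicates(items):
    """Return each item that occurs more than once, in sorted order."""
    ordered = sorted(items)               # Reuse something that already works.
    duplicates = []
    # Compare every item with its right-hand neighbour.
    for previous, current in zip(ordered, ordered[1:]):
        already_recorded = duplicates and duplicates[-1] == current
        if previous == current and not already_recorded:
            duplicates.append(current)
    return duplicates


print(find_duplicates(["the", "cat", "the", "dog", "cat", "the"]))
```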


Iteration vs. recursion

For some function ƒ…

Iteration
Repeat ƒ some number of times.
Calling ƒ in a for loop.

Recursion
ƒ calls itself some number of times:
NP → the N PP.
PP → P NP.
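A sketch that builds the same noun phrase both ways. The two rules on the slide never terminate by themselves, so this example assumes an extra base case, NP → the N, and fixes the words "dog" and "near" purely for illustration.

```python
def noun_phrase_recursive(depth):
    """NP -> the N PP, PP -> P NP, with the assumed base case NP -> the N."""
    if depth == 0:
        return "the dog"                  # Base case: stop recursing.
    # The function calls itself to build the NP inside the PP.
    return "the dog near " + noun_phrase_recursive(depth - 1)


def noun_phrase_iterative(depth):
    """Build the same phrase by repeating one step in a for loop."""
    phrase = "the dog"
    for _ in range(depth):
        phrase = "the dog near " + phrase
    return phrase


print(noun_phrase_recursive(2))
print(noun_phrase_iterative(2))
```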

Next time

Start NLPP §6

Learning to classify text
