sed Exercise

The file you will work with consists of text with markup, or “tags”, that tell how the text is to be displayed. An opening tag is a command enclosed in angle brackets < and >. For example, <para> means “start a paragraph”. The corresponding closing tag, which means “end a paragraph”, is written </para> (note the slash after the less-than sign).

Here is the document that you will be modifying; just copy and paste it into a text file on your system.

<article>
<title>About the Web</title>

<para> This is an article about the World Wide Web. The World Wide Web is a collection of documents that are linked to one another. The Web is <emphasis>not</emphasis> the same as the Internet. The Internet is a world-wide network of networks, and it does far more than simply serve up Web pages. </para>

<para>Tim Berners-Lee, the inventor of the World Wide Web, put special emphasis on the portability of web pages. Rather than create a proprietary format, he made Web pages dependent only upon plain ASCII text.</para>

<para> Web pages are written in a markup language called HTML. Here is what it looks like. The &lt; and &gt; mark off elements. </para>

<listing>
&lt;body&gt; &lt;div id="top-navig"&gt; &lt;a id="top"&gt;&lt;/a&gt; &lt;a href="index.html"&gt;ULI 101 Index&lt;/a&gt; &amp;gt; Assignment 1 &lt;/div&gt;

&lt;h1&gt;Assignment 1&lt;/h1&gt; &lt;p&gt;This exercise shows you how to use the two computer environments that you will use in this class. You will:&lt;/p&gt; &lt;ol class="upper-roman"&gt; &lt;li&gt;Set up your directories on Windows. This is where you will write your HTML documents.&lt;/li&gt; &lt;/ol&gt;
</listing>

<para>It looks difficult, but it is possible to learn HTML in a few weeks. <emphasis>You, too can create web pages for viewing by friends and family!</emphasis> Note that, in our listing, we had to encode &gt; as &amp;gt. </para>
</article>

Write a sed file that does the following. It should work on any HTML or text file, not just this one. That means you can’t count on a particular tag always being on a particular line.

  • Lines with <article> and </article> should be deleted.
  • Replace <title> with Title:, and replace </title> with nothing.
  • Replace all <para> and </para> tags with the null string. If the resulting line is empty, delete the line. (You may need to use curly braces to make this happen.)
  • Replace all <emphasis> and </emphasis> tags with asterisks. Thus:

This is a <emphasis>great</emphasis> bargain.

will become

This is a *great* bargain.

  • Replace the word web with Web everywhere.
  • Replace lines starting with <listing> by ---begin listing
  • Replace lines starting with </listing> by ---end listing
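As a hint for the curly-brace step, here is a minimal sketch of grouping sed commands under one address. The <note> tag and the sample text are hypothetical stand-ins chosen for illustration; they are not part of the assignment file.

```shell
# Hypothetical input: <note> stands in for whatever tag you need to strip.
input='<note>
keep this line
</note> trailing text'

# Every command inside the braces runs only on lines matching the address:
# first strip the tags, then delete the line if only whitespace is left.
result=$(printf '%s\n' "$input" | sed '/<\/*note>/{
s/<\/*note>//g
/^[[:space:]]*$/d
}')

printf '%s\n' "$result"
```

The first line becomes empty after the substitution and is deleted, while the third line keeps its remaining text. If the commands between the quotes are saved in a file (strip.sed is a hypothetical name), sed -f strip.sed yourfile runs them the same way.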

Between the <listing> and </listing>, do these things (you must use curly braces to do this!):

  • Replace all occurrences of &lt; with <.
  • Replace all occurrences of &gt; with >.
  • Replace all occurrences of &amp; with &.
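Restricting commands to a block of lines can be sketched with an address range followed by curly braces. In this sketch the BEGIN and END marker lines and the &quot; entity are hypothetical choices for illustration only.

```shell
# Hypothetical input: only the lines from BEGIN through END should change.
input='say &quot;hi&quot;
BEGIN
say &quot;hi&quot;
END
say &quot;hi&quot;'

# /start/,/end/ selects a range of lines; the braces group the commands
# that apply to every line in that range (marker lines included).
result=$(printf '%s\n' "$input" | sed '/^BEGIN/,/^END/{
s/&quot;/"/g
}')

printf '%s\n' "$result"
```

Only the line between the two markers has its entity replaced; the first and last lines are left alone because they fall outside the range.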

Note: you must do these operations in the order shown above; otherwise, you will get the wrong results!

Note: The & character is a special metacharacter when used in the “replacement” portion of a substitution: it stands for the text that was matched. To get a literal ampersand, escape it with a backslash. For example, if you want to replace the word “and” with “&”, you would use s/and/\&/.
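A small sketch of the difference between an unescaped and an escaped ampersand in the replacement. The sample sentence is made up for illustration.

```shell
line='salt and pepper'

# Unescaped: & stands for the matched text, so "and" is replaced by itself.
unescaped=$(printf '%s\n' "$line" | sed 's/and/&/')

# Escaped: \& is a literal ampersand.
escaped=$(printf '%s\n' "$line" | sed 's/and/\&/')

printf '%s\n' "$unescaped" "$escaped"
```

The first command leaves the line unchanged, and the second prints salt & pepper.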


Submit the commands and the changed file.

Awk Exercise

The following is contained in a text file called employee.txt:

100 Thomas Manager Sales $5,000
200 Jason Developer Technology $5,500
300 Sanjay Sysadmin Technology $7,000
400 Nisha Manager Marketing $9,500
500 Randy DBA Technology $6,000

Using awk, indicate what commands will accomplish the following:

  1. Prints every line in the file
  2. Prints the lines that contain Thomas or Nisha
  3. Prints the second field and last field in the file
  4. Prints the lines that indicate an Employee Id (1st field) that is greater than 200
  5. Prints the employees that are in the Technology department
  6. Prints the employees that make more than $5500 and less than $9500
  7. Prints the number of employees in each department
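The awk features these questions rely on can be sketched on a different, hypothetical data set with the same column layout. The inventory data below is invented for illustration, and the commands are examples of the techniques rather than answers to the questions.

```shell
# Hypothetical data: id, name, kind, group, price.
data='10 Bolt Fastener Hardware $1,200
20 Drill Tool Hardware $3,450
30 Apron Clothing Apparel $2,000'

# A pattern with no action prints every line that matches the regular expression.
matched=$(printf '%s\n' "$data" | awk '/Drill|Apron/')

# $2 is the second field; $NF is the last field on each line.
fields=$(printf '%s\n' "$data" | awk '{ print $2, $NF }')

# Strip "$" and "," from a copy of the last field so it compares as a number.
pricey=$(printf '%s\n' "$data" | awk '{ p = $NF; gsub(/[$,]/, "", p) } p + 0 > 1500 { print $2 }')

# Count lines per group in an associative array and report in the END block.
counts=$(printf '%s\n' "$data" | awk '{ n[$4]++ } END { for (g in n) print g, n[g] }' | sort)

printf '%s\n' "$matched" "$fields" "$pricey" "$counts"
```

The for (g in n) loop visits keys in no guaranteed order, which is why the last command pipes its output through sort.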

Write down the commands and hand them in with the sed submission.

Total: 2%. Due: December 6, 2018.
