From netaxs.com!jcostom Wed May 10 10:52:11 1995
Path: netaxs.com!jcostom
From: jcostom@netaxs.com (Jason Costomiris)
Newsgroups: netaxs.www,netaxs.general
Subject: ASCII to HTML
Date: 10 May 1995 12:41:20 GMT
Organization: Net Access - Philadelphia's Internet Connection
Lines: 85
Message-ID: <3oqc9g$j2h@netaxs.com>
NNTP-Posting-Host: unix3.netaxs.com
X-Newsreader: TIN [version 1.2 PL2]
Xref: netaxs.com netaxs.www:585 netaxs.general:3687

Another useful looking one....

[ Article crossposted from comp.infosystems.www.authoring.html ]
[ Author was Henry Churchyard ]
[ Posted on 10 May 1995 07:17:45 -0500 ]

In article <3op5sf$9ln@news.voicenet.com>, wrote:
> A brief question. I have a ASCII listing of a bunch of
> HTTP sites that I would like to convert to HTML and then
> add text and graphics. Can anyone suggest the best program
> to accomplish this??

Here's a very simple awk script to prepare text for inclusion in an
HTML file by changing the URLs in a plain-text file into HTML links to
those URLs (you may need to use the metachar=1 option -- the
documentation is included as comments in the program source).

===============================================================================
#url2html.awk -- Takes a plain text file which contains URL's, and outputs
# HTML code with links to those URL's.  The beginning of a URL is
# signalled by "[a-z]+://", and a URL cannot contain any "<" or ">"
# characters, or whitespace.  Trailing punctuation is stripped from
# URL's before embedding in the <A HREF> tag.  Cannot handle URL's
# split over more than one line.  This program can be useful for
# importing plain text into an HTML file.
#
#  Use:
#
#     awk -f url2html.awk infile.txt > outfile.forhtml
#
#     awk -f url2html.awk metachar=1 infile.txt > outfile.forhtml
#
#(You may need to use nawk or gawk instead of awk.)
#
#   Uncomment the following "BEGIN{metachar=1}" line (i.e. delete the
# line-initial "#" character), or add "metachar=1" to the command line, if you
# want "<", ">", and "&" characters in the input file to be translated to HTML
# ampersand entities for safety.
#
#BEGIN{metachar=1}
#
# Lines that contain "://": copy plain text through, wrap each URL in a link.
/:\057\057/{line="";currsrch=1;txtbeg=1;
  while (match(substr($0,currsrch),/[a-z]+:\057\057[^<> \t]+/)!=0)
    {x=substr($0,(currsrch+RSTART-1),RLENGTH);rleng=RLENGTH
     # Strip trailing punctuation from the matched URL.
     for (i=RLENGTH;i>=1;--i)
       {z=substr(x,i,1);
        if ((z==")")||(z==".")||(z==",")||(z==":")||(z==";")||(z=="]")||(z==">")||(z=="\042"))
          {x=substr(x,1,(length(x)-1));--rleng}
        else {break}};
     currsrch=(currsrch+RSTART);
     # Copy any plain text that precedes the URL.
     if (currsrch>(txtbeg+1))
       {inert=substr($0,txtbeg,(currsrch-(txtbeg+1)));
        if (metachar)
          {gsub(/&/,"\\&amp;",inert); gsub(/>/,"\\&gt;",inert);
           gsub(/</,"\\&lt;",inert)};
        line=(line inert)};
     # The two statements below are a reconstruction (this stretch of the
     # posted source is damaged): resume just past the stripped URL and
     # emit the link, using the URL itself as the link text.
     txtbeg=(currsrch+rleng-1);currsrch=txtbeg;
     line=(line "<A HREF=\"" x "\">" x "</A>")}
  # Copy whatever text follows the last URL on the line.
  if (txtbeg<=length($0))
    {inert=substr($0,txtbeg);
     if (metachar)
       {gsub(/&/,"\\&amp;",inert); gsub(/>/,"\\&gt;",inert);
        gsub(/</,"\\&lt;",inert)};
     line=(line inert)};
  # "print line; next" is likewise reconstructed.
  print line; next}
# Lines without a URL: escape metacharacters if requested, then print.
{if ((metachar)&&(match($0,/[<>&]/)))
   {gsub(/&/,"\\&amp;");gsub(/>/,"\\&gt;");gsub(/</,"\\&lt;")};
 print}

Henry Churchyard
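
The matching rule described in the script's header comments (a URL
starts at "[a-z]+://", runs up to the next "<", ">", or whitespace, and
then loses any trailing punctuation) can be tried on its own. What
follows is only a minimal sketch of that one step, not the posted
script: the function name extract_urls and the example.org addresses
are made up for illustration, and it prints the URLs it finds rather
than generating HTML.

```shell
# extract_urls (hypothetical helper): print each URL found on stdin,
# one per line, with trailing punctuation removed.
extract_urls() {
  awk '{
    rest = $0
    # Same pattern as the script: scheme, "://", then any run of
    # characters other than "<", ">", space, or tab.
    while (match(rest, /[a-z]+:\/\/[^<> \t]+/)) {
      url  = substr(rest, RSTART, RLENGTH)
      rest = substr(rest, RSTART + RLENGTH)
      # Drop trailing ) . , : ; ] > and double quote, which are more
      # likely sentence punctuation than part of the URL.
      while (url ~ /[]).,:;>"]$/) url = substr(url, 1, length(url) - 1)
      print url
    }
  }'
}

# Placeholder input; these addresses are not from the original post.
out=$(printf '%s\n' \
  'See (http://example.org/a.html), or ftp://example.org/pub/.' |
  extract_urls)
printf '%s\n' "$out"
# prints:
#   http://example.org/a.html
#   ftp://example.org/pub/
```

Because the pattern is greedy, the closing parenthesis and comma are
first matched as part of the URL and only removed afterwards, which is
why the stripping step is needed at all.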


-- 
Jason Costomiris                     | "...You'll sweat grease."
jcostom@netaxs.com                   |
My employers like me, but not enough |  --overheard outside the new
to let me speak for them.            |    restaurant, "Everything Pork."
http://www.netaxs.com/~jcostom