AndrewNikitin/Jtangle

<!> CategoryWorkInProgress

  1. J script
  2. Original Perl script
  3. Perl script
    1. Scan entire file
    2. Select and unwrap top-level section
    3. Warn about unused sections
  4. Discussion

For some reason, the following command garbles line endings:

wget -U "Mozilla" -O- http://www.jsoftware.com/jwiki/AndrewNikitin/Literate?action=raw | perl jtangle.pl >z.ijs

J script

Original Perl script

jtangle.pl.txt

Perl script

The current literate parser puts a J-style comment (NB.) at the top of the extracted file. Remove that comment before running the script; Perl will not run it properly otherwise.

«perl»=
# 2007-03-19
use strict;
use bytes;
use Getopt::Std;

The default behaviour is to ignore file names and dump everything to STDOUT. The problem is that a file name may contain a path, so a section could attempt to overwrite system files, for example via

{ { {#!literate name='c:\autoexec.bat'
... something sinister

This way the harm is done during the source extraction stage, which is less expected. Some kind of checking mechanism needs to be implemented.
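A minimal sketch of what such a check could look like. The helper name is_safe_name and its rules are assumptions for illustration only and are not part of jtangle.pl; it is deliberately strict and rejects any name that contains a path separator or a drive letter, so sections could only be written into the current directory.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of jtangle.pl): accept only plain
# file names that stay inside the current directory.
sub is_safe_name {
    my ($name) = @_;
    return 0 if !defined $name || $name eq '';
    return 0 if $name =~ m{[/\\:]};            # no path separators or drive letters
    return 0 if $name =~ /^\.+$/;              # reject "." and ".."
    return 0 if $name =~ /[\0-\x1f<>|&"*?]/;   # no control or shell-special characters
    return 1;
}

for my $name ('z.ijs', 'c:\autoexec.bat', '../etc/passwd', '..') {
    printf "%-20s %s\n", $name, is_safe_name($name) ? 'accepted' : 'rejected';
}
```

With a check like this, the -f branch could skip (with a warning) any section whose name is rejected instead of opening it for writing.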

«perl»=
our(
 $opt_f, # use filenames from sections, otherwise dump everything to STDOUT;
);
getopts("f");
«perl»=
my %piece;
my $section;
my $PREFIX;
my @STACK;
my %unwrapped;

Scan entire file

For each line of the input, collect lines into the %piece hash, which maps section names to arrays of strings.

«perl»=
my $CLOSE='}' x 3; # kludge to work around current literate parsing
while(<>) {
  my $n;
  if( $n=/^\s*\{\{\{#!literate.*name='([^']*)'/ .. /^\s*$CLOSE\s*$/ ) {
    if( 1==$n ) {
      $section=$1;
      $piece{$section}=[] unless exists $piece{$section};
    }

In scalar context, Perl's .. (range operator) returns a sequence number, with 'E0' appended on the line that matches the final expression. This does not change the number's numeric value, but it gives something to look for when testing for the final expression.

«perl»=
    elsif( 'E0' ne substr($n,-2,2) ) {
      push @{$piece{$section}},$_;
    }
  }
}
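The 'E0' suffix described above can be seen in isolation with a small stand-alone script. The BEGIN and END markers and the sample lines are made up for this demonstration and have nothing to do with the wiki markup.

```perl
use strict;
use warnings;

# Illustrative input; the markers BEGIN and END are made up for this demo.
my @lines = ("before\n", "BEGIN\n", "body 1\n", "body 2\n", "END\n", "after\n");

my @seen;
for (@lines) {
    # In scalar context ".." is a flip-flop: false (empty string) outside
    # the range, a sequence number inside it, with "E0" appended on the
    # line that matches the right-hand expression.
    my $n = /^BEGIN/ .. /^END/;
    push @seen, $n;
    next unless $n;
    chomp(my $text = $_);
    print "$n\t$text\n";
}
```

The script prints the sequence values 1, 2, 3 and 4E0 next to the lines from BEGIN through END, so the opening line is the one with value 1 and the closing line is the one whose value ends in 'E0'.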

Select and unwrap top-level section

Scan through the named sections and recursively unwrap those that contain '.' in their name. $PREFIX holds the string to prepend to indented sections (for now only whitespace). @STACK holds the list of pending sections, used to detect self-references.

«perl»=
close STDOUT if $opt_f;
for my$s(keys %piece) {
  if( 0<=index($s,'.') ) {
    $PREFIX='';
    @STACK=();
    open STDOUT, ">$s" if $opt_f;
    unwrap($s);
    close STDOUT if $opt_f;
  }
}

Procedure that recursively unwraps sections

«perl»=
sub unwrap($)
{
  my $s=shift;
  if( !exists $piece{$s} ) {
    warn "Section $s is referenced but not defined. Nothing is substituted";
    return;
  }

If the name of the current section is already in @STACK, the substitution would never finish. Give a warning and ignore this occurrence of the section.

«perl»=
  for my$e(@STACK) {
    if( $s eq $e ) {
      warn "Recursion detected: $s";
      return;
    }
  }

For each line of the section, either output it (with $PREFIX prepended) or, if it is a section reference, recursively unwrap it. For now there can be only one section reference per line, and nothing but whitespace is allowed around it.

«perl»=
  $unwrapped{$s}++;
  push @STACK,$s;
  for my$l(@{$piece{$s}}) {
    if( $l=~/^(\s*)«(.*)»\s*$/ ) {
      my $p=$PREFIX;
      $PREFIX=$p.$1;
      unwrap($2);
      $PREFIX=$p;
    } else {
      print "",$PREFIX,$l;
    }
  }
  pop @STACK;
}
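To illustrate how indentation accumulates through nested references, here is a small stand-alone sketch. The section names and contents are made up, and the expand function is a simplified variant of unwrap for illustration only: it passes the prefix as an argument instead of using a global, returns the lines instead of printing them, and omits the checks for missing sections and recursion.

```perl
use strict;
use warnings;

# Made-up sections for illustration; "demo.txt" plays the top-level
# section because its name contains a dot.
my %piece = (
    'demo.txt' => ["begin\n", "  «body»\n", "end\n"],
    'body'     => ["first\n", "  «inner»\n"],
    'inner'    => ["deep\n"],
);

# Same idea as unwrap(), but collecting lines instead of printing and
# passing the prefix explicitly. No recursion or missing-section checks.
sub expand {
    my ($name, $prefix) = @_;
    my @out;
    for my $line (@{ $piece{$name} }) {
        if ($line =~ /^(\s*)«(.*)»\s*$/) {
            # The reference's own indentation is added to the prefix.
            push @out, expand($2, $prefix . $1);
        } else {
            push @out, $prefix . $line;
        }
    }
    return @out;
}

print expand('demo.txt', '');
```

Here "first" comes out indented by two spaces and "deep" by four, because each reference adds its own leading whitespace to the prefix inherited from the enclosing section.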

Warn about unused sections

At the end, check whether any of the named sections were not used by unwrap and give a warning for each.

«perl»=
for my$s(keys %piece) {
  if( !exists $unwrapped{$s} ) {
    warn "Section $s is defined but never used";
  }
}

Contributed by AndrewNikitin

CategoryLiterate


Discussion

(Notwithstanding that it's work in progress.) Literate/Wiki Tool is implemented as a MoinMoin plugin. It has, naturally, its own "tangle", which is also naturally implemented in Python. It's not a question of whose choice of implementation language is better, but a practical one: wouldn't having an alternative Perl implementation be duplicating the effort? The Literate Wiki Tool will be evolving, and it only makes sense to have the same code base for tangle, which will be used in both places: the stand-alone script and the Wiki plugin. -- OlegKobchenko 2007-03-20 19:01:55

I need a Perl script for my internal process anyway. Besides, I do not have Python installed on any of my machines and will not have it in the foreseeable future, and "duplicating effort" on a one-page script does not seem like such a waste to me. BTW, if you post your Python parser, preferably in literate form, I will try to ensure that the Perl and J implementations match it as closely as possible. -- AndrewNikitin 2007-03-20 19:10:13

It is published where it should be: in the parser market of MoinMoin. Making it literate is a good idea. I don't know how complicated the parser installation process is at the MoinMoin web site, but having some experience and a few rounds of improvements here at the J Wiki will help convince them. -- OlegKobchenko 2007-03-20 19:47:40

last edited 2007-03-25 12:16:13 by OlegKobchenko