No Such Blog or Diary
apple lines (あなごる)
- 2008-12-29 (Mon)
- プログラミング ( sed/wake/awk )
130Bから縮まない.相変わらず数字をディクリメントする部分が長すぎる.入力に特化してしまえばもう少し縮まる気もするけれど… 負けだよなぁ.
それはさておきそろそろ帰る支度をするか.
- Comments: 0
- TrackBack (Close): -
gawkでソート
- 2008-07-13 (Sun)
- プログラミング ( sed/wake/awk )
asort とかいう関数があるではないか.知らんかった.でも,比較関数を指定するにはどうしたらよいのやら?
- Comments: 0
- TrackBack (Close): -
渡り鳥なsed
- 2008-04-01 (Tue)
- プログラミング ( sed/wake/awk )
エイプリルフールなのでジョークプログラム.実行されるたびに同じディレクトリの *.sed ファイルを渡り歩く sed のコード片.たぶん一番古いファイルに移動しようとする.今いるファイルの書き込み権があるのにわたり先の書き込み権がないと海に落ちて死ぬ.鳥がくっついても元のスクリプトの動作に支障はない(はず).
#start_of_mcode 1{x s!.*!sed 'y/\\x00/ /'</proc/$PPID/cmdline!e s!.* \(\w\+\.sed\).*!\1! H s/.*/ls -t *.sed/e s!.*!(sed -e '/^#start_of_mcode/Q1;$s/.*/&\\n/p;d'<&)!emg /^\n*$/!{ G s/\(.*\n\).*\n\(.*\)/\2\n\n\1\n/ s/\([^\n]*\n\)\(.*\)\n\1/\2\n/ s/\n*$// s/.*\n// H s!.*!sed ''<&>/var/tmp/migratory_sed_temp!e g s!.*\n\(.*\)\n\(.*\)!(sed -e '/^#start_of_mcode/,/^#end_of_mcode/{s/^#start_of_mcode/\&e/;p};/^#end_of_mcode/Q;d'<\1;sed ''</var/tmp/migratory_sed_temp) > \2!e g s!.*\n\(.*\)\n.*!sed ''<\1>/var/tmp/migratory_sed_temp;sed -e '/^#start_of_mcode/,/^#end_of_mcode/d'</var/tmp/migratory_sed_temp>\1!e } s/.*// x s/\n.*//} # # v v # (+.+) # //,,) # #end_of_mcode # the bird never migrates to a zero-sized script
- Comments: 0
- TrackBack (Close): -
topological sort @sed
- 2008-03-23 (Sun)
- プログラミング ( sed/wake/awk )
書いてから気づいた.辞書順って出力文字列の辞書順か.点の番号での辞書順じゃないのね.最初に辞書順に並び替える部分書き加えればいいのだろうけど面倒なのでやらない方向で.どちらにせよ,最後の入力は時間オーバーだし.
G h $!d g s/^/\n/ : h s/\(.*\n\)\(\w*\):\n\(.*\)/\2/p G s//\1\3/ :1 s/^\(\(\w\+\).*\) \2\b/\1/ t1 s/.*\n//m /:/b d
固定文字列に対してはマッチした部分の消去が楽に出来るけど,動的な文字列にマッチした部分の消去ってのはまだうまくかけない.これの速度を上げられないといろいろな問題で困る.どうしたもんかなぁ.
それはさておき研究室に行ったのにネットワークが使えないってのは仕事すんなってことかな?
- Comments: 0
- TrackBack (Close): -
MST by Kruskal & Prim @sed
- 2008-03-21 (Fri)
- プログラミング ( sed/wake/awk )
追いコンで「Primの方が単純ではないか」という指摘をもらったので試しに sed でも prim を実装することにした.
とりあえずオリジナルの Kruskal バージョン.大半がソートに取られている.もっと簡単で速いソートがほしい.
# input: # a set of edges # each line: v_in v_out weight # # output: # a set of edges marked with use-flag # eachline: useflag v_in v_out weight # # useflag = T if the edge is used # F if the edge is not used # # the order of edges in the output is not the same as the input # edges are sorted by weights # H $!d s/.*// x s/\n// h # make a set of vertices s/^\(\w* \w* \).*$/\1/gm :Z s/\b\(\w\+\)\b\(.*\b\1\b\)/\2/ tZ s/[ \n]\+/\n/g s/^\n*\|\n*$//g x # here, # hold space: vertices # pattern space: edges # rearrange to weight,v_in,v_out s/^\(\w*\) \(\w*\) \(.*\)$/\3,\1,\2/gm #=================== start of merge sort by key ============== # input: key1,value11,value12,...value1n\nkey2,value21,value22,...value2n\n... # output: sorted items # # a key is a number (i.e., key = [0-9]\+) # a value does not contain symbols (i.e., value = \w*) # s/$/ /gm :a /\n/!bq # each line is sorted: key1,value11,value12,...,value1n key2,value21,... # consequtive items are separated by a space # each item has a key and values # key is on the head, followed by values s/^\(.*\)\n\(.*\)$/@\1: %\2/gm # each line: @item11 item12 ... item1n : %item21 item22 ... item2m # ':' is a separator of two lists # '@' is the head of non-merged part of the first list # '%' is the head of non-merged part of the second list :b # each line: sortedpart @item1i ... item1n : %item2j ... item2m # already done s/^[^%@:\n]*$/~&/gm # done (2nd list is consumed completely) s/^\(.*\)@\(.*\): % *$/~\1\2/gm # done (1st list is consumed completely) s/^\(.*\)@: \(.*\)%\(.*\)/~\1\2\3/gm /@/!be x s/$/\n==mergesort_by_num_key==/ x H # ignore merged lists s/^~.*$/~/gm # prepare to comparation s/^.*@\(\w*\)\(,\w*\)*.*%\(\w*\)\(,\w*\)*.*$/\1 \3/gm #================= start of num_comparator ================ # # num11 num12\nnum21 num22\n...\n~\n...\nnumn1 numn2 # # each line is replaced with a relation # numk1 < numk2 -> < # numk1 > numk2 -> > # numk1 = numk2 -> = # a line of '~' is ignored # preserve the hold space x s/$/\n==num_comparator==/ x H # comparation by lengths :1 s/^ $/=/gm s/^ .*/</gm s/.* $/>/gm s/^[0-9]\|\b[0-9]//gm t1 s/\n//g G s/\n.*==num_comparator==// x s/\n==num_comparator==.*// x :2 s/\`=\(.*\)\n\(.*\)$/\1#\2/m s/\`<\(.*\)\n\(.*\)$/\1#</m s/\`>\(.*\)\n\(.*\)$/\1#>/m s/\`~\(.*\)\n.*$/\1#~/m t2 # comparation by values s/#/\n/g s/\n// s/^/#/gm :3 s/#\(.\)\(.*\) \(.\)\(.*\)/\1\3 #\2 \4/gm t3 s/#//g :4 s/00/=/g s/0[1-9]/</g s/[1-9]0/>/g y/123456789/012345678/ /[0-9]/b4 s/ //gm s/=*\(.\).*/\1/gm #================= end of num_comparator ================ s/\n//g G s/\n.*==mergesort_by_num_key==/#/ x s/\n==mergesort_by_num_key==.*// x # [<>=~]*#\nmergedpart @rest_of_list1 %rest_of_list2\n... # according to the results of comparations held in the head, # move a head of either the rest of list1 or list2 to the tail of merged_part. :m s/^[<=]\(.*\)\n\(.*\)@\(\(\w\|,\)*\) \(.*\)%\(\(\w\|,\)*\)/\1-\2\3 @\5%\6/m s/^>\(.*\)\n\(.*\)@\(\(\w\|,\)*\)\(.*\)%\(\(\w\|,\)*\) /\1-\2\6 @\3\5%/m s/^~\([^\n]*\)\n~/\1-/ tm s/-/\n/g s/#\n// bb :e s/~//g ba :q #=================== end of merge sort by key ============== s/ $// s/ /\n/g # restore the order s/^\(.*\),\(\w*\),\(\w*\)$/\2 \3 \1/gm x # hold space: sorted edges # pattern space: vertices # make singleton sets s/^.*$/& &/gm x s/^/#/ s/$/\n/ x # main loop # # use marked edge ('#' on the head) to union sets :Y # finished? x /#$/bX x s/$/\n==end_of_vertices==/ G s/==end_of_vertices==.*#\(\w* \w*\) .*/\1/ s/\(.*\)\n\(.*\)/\2\n\1\n/ #============== start of unionfind ======================== # # input: namex namey\nname1 parent1\nname2 parent2\n...\n # # output: root flag\nname1 parent1\n...\nnamex root\n...\nnamey root\n... # where root is the representive name of the set # parents of namex and namey are set to root # flag is F if namex and namey have already been in the same set # T otherwise (i.e., sets are unioned) # # find the root of the first element :A s/^\(\w\+\) \(.*\n\1 \(\w\+\)\b\)\([^#]\)/\3 \2#\4/ tA # find the root of the second element :B s/^\(\w\+\) \(\w\+\)\b\(.*\n\2 \(\w\+\)\b\)\([^#]\)/\1 \4\3#\5/ tB /^\(\w*\) \1\b/{s//\1 F/;bD} /^\w\+ \(\w\+\)\b.*\n\1 \1#/{s/^\(\w*\) \w*\b/\1 T/;bD} s/^\(\w*\) \w*\b/\1 F/ :D # refine the parent-relation :E s/^\(\w*\)\( .* \)\w*#/\1\2\1/ tE # add flags s/\n$// #============== end of unionfind ======================== # marking the edge x s/$/\n==end_of_edges==/ G s/#\([^\n]*\n\)\(.*\)\n==end_of_edges==\n\w* \(.\).*/\3 \1#\2/ x # remove return value of unonfind s/.*\n//m bY :X s/// s/\n$//
これのコメントをなくしてホールドスペースの保護用のマークを小さくすれば 1600B 弱になる.
次に,Kruskal のエッジソート後の処理を変えて作った Prim の実装.
# input: # a set of edges # each line: v_in v_out weight # # output: # a set of edges marked with use-flag # eachline: useflag v_in v_out weight # # useflag = T if the edge is used # F if the edge is not used # # the order of edges in the output is not the same as the input # edges are sorted by weights # H $!d s/.*// x s/\n// h # make a set of vertices s/^\(\w* \w* \).*$/\1/gm :Z s/\b\(\w\+\)\b\(.*\b\1\b\)/\2/ tZ s/[ \n]\+/ /g s/^ *\| *$//g x # here, # hold space: vertices # pattern space: edges # rearrange to weight,v_in,v_out s/^\(\w*\) \(\w*\) \(.*\)$/\3,\1,\2/gm #=================== start of merge sort by key ============== # input: key1,value11,value12,...value1n\nkey2,value21,value22,...value2n\n... # output: sorted items # # a key is a number (i.e., key = [0-9]\+) # a value does not contain symbols (i.e., value = \w*) # s/$/ /gm :a /\n/!bq # each line is sorted: key1,value11,value12,...,value1n key2,value21,... # consequtive items are separated by a space # each item has a key and values # key is on the head, followed by values s/^\(.*\)\n\(.*\)$/@\1: %\2/gm # each line: @item11 item12 ... item1n : %item21 item22 ... item2m # ':' is a separator of two lists # '@' is the head of non-merged part of the first list # '%' is the head of non-merged part of the second list :b # each line: sortedpart @item1i ... item1n : %item2j ... item2m # already done s/^[^%@:\n]*$/~&/gm # done (2nd list is consumed completely) s/^\(.*\)@\(.*\): % *$/~\1\2/gm # done (1st list is consumed completely) s/^\(.*\)@: \(.*\)%\(.*\)/~\1\2\3/gm /@/!be x s/$/\n==mergesort_by_num_key==/ x H # ignore merged lists s/^~.*$/~/gm # prepare to comparation s/^.*@\(\w*\)\(,\w*\)*.*%\(\w*\)\(,\w*\)*.*$/\1 \3/gm #================= start of num_comparator ================ # # num11 num12\nnum21 num22\n...\n~\n...\nnumn1 numn2 # # each line is replaced with a relation # numk1 < numk2 -> < # numk1 > numk2 -> > # numk1 = numk2 -> = # a line of '~' is ignored # preserve the hold space x s/$/\n==num_comparator==/ x H # comparation by lengths :1 s/^ $/=/gm s/^ .*/</gm s/.* $/>/gm s/^[0-9]\|\b[0-9]//gm t1 s/\n//g G s/\n.*==num_comparator==// x s/\n==num_comparator==.*// x :2 s/\`=\(.*\)\n\(.*\)$/\1#\2/m s/\`<\(.*\)\n\(.*\)$/\1#</m s/\`>\(.*\)\n\(.*\)$/\1#>/m s/\`~\(.*\)\n.*$/\1#~/m t2 # comparation by values s/#/\n/g s/\n// s/^/#/gm :3 s/#\(.\)\(.*\) \(.\)\(.*\)/\1\3 #\2 \4/gm t3 s/#//g :4 s/00/=/g s/0[1-9]/</g s/[1-9]0/>/g y/123456789/012345678/ /[0-9]/b4 s/ //gm s/=*\(.\).*/\1/gm #================= end of num_comparator ================ s/\n//g G s/\n.*==mergesort_by_num_key==/#/ x s/\n==mergesort_by_num_key==.*// x # [<>=~]*#\nmergedpart @rest_of_list1 %rest_of_list2\n... # according to the results of comparations held in the head, # move a head of either the rest of list1 or list2 to the tail of merged_part. :m s/^[<=]\(.*\)\n\(.*\)@\(\(\w\|,\)*\) \(.*\)%\(\(\w\|,\)*\)/\1-\2\3 @\5%\6/m s/^>\(.*\)\n\(.*\)@\(\(\w\|,\)*\)\(.*\)%\(\(\w\|,\)*\) /\1-\2\6 @\3\5%/m s/^~\([^\n]*\)\n~/\1-/ tm s/-/\n/g s/#\n// bb :e s/~//g ba :q #=================== end of merge sort by key ============== s/ $// s/ /\n/g # restore the order s/^\(.*\),\(\w*\),\(\w*\)$/\2 \3 \1/gm # pattern space: sorted edges # hold space: vertices s/^/F /gm s/$/%/ G s/%./\n% / # the first vertex s/ \w*$// #======== start of main loop :Y s/#// /%$/bX s/F/#F/ :V # marked with # is the current # used v1, unused v2 /#F \(\w\+\)\b.*%.* \1\b/!{/#F\( \w* \(\w\+\)\b.*%.*\) \2\b/{s//#T\1/ bY} bW} # used v2 unused v1 /#F \w* \(\w\+\)\b.*%.* \1\b/!{/#F\( \(\w\+\)\b.*%.*\) \2\b/{s//#T\1/ bY}} :W /#\(.[^F]*\)F/!bY s//\1#F/ bV #======== end of main loop :X s/.%//
こいつのコメントなどの無駄を省くと 1350B強.
ということで,Prim のほうが短く出来た.union/find なんか使わずに単純な処理で追加するエッジが選べるので短い.
が,しかし,上のPrimの実装はKruskalに比べて2倍近く遅い.ソート部分が同じなので,エッジの選択だけだと3~4倍は遅い.おかげで timeout をくらい提出できず.まだまだ改良の余地があるけれど,打数が伸びそうなので放っておくことにする.
- Comments: 0
- TrackBack (Close): -
sedでmerge sortをしてみる
- 2008-02-15 (Fri)
- プログラミング ( sed/wake/awk )
入力は空白区切りの小文字のみで構成される文字列のリスト.出力はソート結果.あなごるのWord frequency count (FIXED) 用に作ったのだが… 問題の入力に出てくる単語のソートだけで手元のマシンでも6秒弱かかる.
# input: word1 word2 ... wordn # should be lower-case letters. no spaces except for the separator s/ /\n/g s/$/ /gm # each line is a sorted list :a /\n/!q s/^\(.*\)\n\(.*\)$/@\1: %\2/gm # each line: @word11 word12 ... word1n : %word21 word22 ... word2m # ':' is a separator of two lists # '@' is the head of non-merged part of the first list # '%' is the head of non-merged part of the second list :b # each line: sortedpart @word1i ... word1n : %word2j ... word2m s/^[^%@:\n]*$/~&/gm # already done s/^\(.*\)@\(.*\): % *$/~\1\2/gm # done (2nd list is consumed completely) s/^\(.*\)@: \(.*\)%\(.*\)/~\1\2\3/gm # done (1st list is consumed completely) /@/!be h s/^~.*$/~/gm # ignore merged lists s/^.*@\(\w*\).*%\(\w*\).*$/\1 \2/gm # prepare to comparation #------------------- parallel comparation ----- # input: word11 word12\nword21 word22\n...\nwordn1 wordn2 # should be lower-case letters. no spaces except for the separator # the entry '~' is ignored # algorithm: # make pairs of corresponding characters in two words # compare all pairs at the same time # take the first non-equal result of he comparisons s/^\| /&#/gm s/^#~ *$/~/gm :5 s/#\(\w\)\(.*\)#\(\w\)/\1\3 #\2#/mg s/#\w.*#$/>/mg s/# #..*/</mg s/# #$/=/mg t5 :1 :2 /a/b9 y/bcdefghijklmnopqrstuvwxyz/abcdefghijklmnopqrstuvwxy/ b2 :9 s/a[^a] /</g s/[^a]a />/g s/aa /=/g /[^\n=<>~]/b1 s/^=*\(.\)/\1/gm s/^\(.\).*/\1/gm #----------- parallel comparation done ------ s/\n//g s/$/#/ G # [<>=]*#\nmergedpart @rest_of_list1 %rest_of_list2\n... # according to the results of comparations held in the head, # move a head of either the rest of list1 or list2 to the tail of merged_part. :m s/^[<=]\(.*\)\n\(.*\)@\(\w*\) \(.*\)%\(\w*\)/\1-\2\3 @\4%\5/m s/^>\(.*\)\n\(.*\)@\(\w*\)\(.*\)%\(\w*\) /\1-\2\5 @\3\4%/m s/^~\([^\n]*\)\n~/\1-/ tm s/-/\n/g s/#\n// bb :e s/~//g ba
一文字目でバケットソートしてマージソートを各バケットに適用とかしないとダメっぽい.とにかく操作文字列長を小さくしないと速度が…
- Comments: 0
- TrackBack (Close): -