2008年03月
topological sort @sed
- 2008-03-23 (Sun)
- プログラミング ( sed/wake/awk )
書いてから気づいた.辞書順って出力文字列の辞書順か.点の番号での辞書順じゃないのね.最初に辞書順に並び替える部分書き加えればいいのだろうけど面倒なのでやらない方向で.どちらにせよ,最後の入力は時間オーバーだし.
G h $!d g s/^/\n/ : h s/\(.*\n\)\(\w*\):\n\(.*\)/\2/p G s//\1\3/ :1 s/^\(\(\w\+\).*\) \2\b/\1/ t1 s/.*\n//m /:/b d
固定文字列に対してはマッチした部分の消去が楽に出来るけど,動的な文字列にマッチした部分の消去ってのはまだうまくかけない.これの速度を上げられないといろいろな問題で困る.どうしたもんかなぁ.
それはさておき研究室に行ったのにネットワークが使えないってのは仕事すんなってことかな?
- Comments: 0
- TrackBack (Close): -
MST by Kruskal & Prim @sed
- 2008-03-21 (Fri)
- プログラミング ( sed/wake/awk )
追いコンで「Primの方が単純ではないか」という指摘をもらったので試しに sed でも prim を実装することにした.
とりあえずオリジナルの Kruskal バージョン.大半がソートに取られている.もっと簡単で速いソートがほしい.
# input:
# a set of edges
# each line: v_in v_out weight
#
# output:
# a set of edges marked with use-flag
# eachline: useflag v_in v_out weight
#
# useflag = T if the edge is used
# F if the edge is not used
#
# the order of edges in the output is not the same as the input
# edges are sorted by weights
#
H
$!d
s/.*//
x
s/\n//
h
# make a set of vertices
s/^\(\w* \w* \).*$/\1/gm
:Z
s/\b\(\w\+\)\b\(.*\b\1\b\)/\2/
tZ
s/[ \n]\+/\n/g
s/^\n*\|\n*$//g
x
# here,
# hold space: vertices
# pattern space: edges
# rearrange to weight,v_in,v_out
s/^\(\w*\) \(\w*\) \(.*\)$/\3,\1,\2/gm
#=================== start of merge sort by key ==============
# input: key1,value11,value12,...value1n\nkey2,value21,value22,...value2n\n...
# output: sorted items
#
# a key is a number (i.e., key = [0-9]\+)
# a value does not contain symbols (i.e., value = \w*)
#
s/$/ /gm
:a
/\n/!bq
# each line is sorted: key1,value11,value12,...,value1n key2,value21,...
# consequtive items are separated by a space
# each item has a key and values
# key is on the head, followed by values
s/^\(.*\)\n\(.*\)$/@\1: %\2/gm
# each line: @item11 item12 ... item1n : %item21 item22 ... item2m
# ':' is a separator of two lists
# '@' is the head of non-merged part of the first list
# '%' is the head of non-merged part of the second list
:b
# each line: sortedpart @item1i ... item1n : %item2j ... item2m
# already done
s/^[^%@:\n]*$/~&/gm
# done (2nd list is consumed completely)
s/^\(.*\)@\(.*\): % *$/~\1\2/gm
# done (1st list is consumed completely)
s/^\(.*\)@: \(.*\)%\(.*\)/~\1\2\3/gm
/@/!be
x
s/$/\n==mergesort_by_num_key==/
x
H
# ignore merged lists
s/^~.*$/~/gm
# prepare to comparation
s/^.*@\(\w*\)\(,\w*\)*.*%\(\w*\)\(,\w*\)*.*$/\1 \3/gm
#================= start of num_comparator ================
#
# num11 num12\nnum21 num22\n...\n~\n...\nnumn1 numn2
#
# each line is replaced with a relation
# numk1 < numk2 -> <
# numk1 > numk2 -> >
# numk1 = numk2 -> =
# a line of '~' is ignored
# preserve the hold space
x
s/$/\n==num_comparator==/
x
H
# comparation by lengths
:1
s/^ $/=/gm
s/^ .*/</gm
s/.* $/>/gm
s/^[0-9]\|\b[0-9]//gm
t1
s/\n//g
G
s/\n.*==num_comparator==//
x
s/\n==num_comparator==.*//
x
:2
s/\`=\(.*\)\n\(.*\)$/\1#\2/m
s/\`<\(.*\)\n\(.*\)$/\1#</m
s/\`>\(.*\)\n\(.*\)$/\1#>/m
s/\`~\(.*\)\n.*$/\1#~/m
t2
# comparation by values
s/#/\n/g
s/\n//
s/^/#/gm
:3
s/#\(.\)\(.*\) \(.\)\(.*\)/\1\3 #\2 \4/gm
t3
s/#//g
:4
s/00/=/g
s/0[1-9]/</g
s/[1-9]0/>/g
y/123456789/012345678/
/[0-9]/b4
s/ //gm
s/=*\(.\).*/\1/gm
#================= end of num_comparator ================
s/\n//g
G
s/\n.*==mergesort_by_num_key==/#/
x
s/\n==mergesort_by_num_key==.*//
x
# [<>=~]*#\nmergedpart @rest_of_list1 %rest_of_list2\n...
# according to the results of comparations held in the head,
# move a head of either the rest of list1 or list2 to the tail of merged_part.
:m
s/^[<=]\(.*\)\n\(.*\)@\(\(\w\|,\)*\) \(.*\)%\(\(\w\|,\)*\)/\1-\2\3 @\5%\6/m
s/^>\(.*\)\n\(.*\)@\(\(\w\|,\)*\)\(.*\)%\(\(\w\|,\)*\) /\1-\2\6 @\3\5%/m
s/^~\([^\n]*\)\n~/\1-/
tm
s/-/\n/g
s/#\n//
bb
:e
s/~//g
ba
:q
#=================== end of merge sort by key ==============
s/ $//
s/ /\n/g
# restore the order
s/^\(.*\),\(\w*\),\(\w*\)$/\2 \3 \1/gm
x
# hold space: sorted edges
# pattern space: vertices
# make singleton sets
s/^.*$/& &/gm
x
s/^/#/
s/$/\n/
x
# main loop
#
# use marked edge ('#' on the head) to union sets
:Y
# finished?
x
/#$/bX
x
s/$/\n==end_of_vertices==/
G
s/==end_of_vertices==.*#\(\w* \w*\) .*/\1/
s/\(.*\)\n\(.*\)/\2\n\1\n/
#============== start of unionfind ========================
#
# input: namex namey\nname1 parent1\nname2 parent2\n...\n
#
# output: root flag\nname1 parent1\n...\nnamex root\n...\nnamey root\n...
# where root is the representive name of the set
# parents of namex and namey are set to root
# flag is F if namex and namey have already been in the same set
# T otherwise (i.e., sets are unioned)
#
# find the root of the first element
:A
s/^\(\w\+\) \(.*\n\1 \(\w\+\)\b\)\([^#]\)/\3 \2#\4/
tA
# find the root of the second element
:B
s/^\(\w\+\) \(\w\+\)\b\(.*\n\2 \(\w\+\)\b\)\([^#]\)/\1 \4\3#\5/
tB
/^\(\w*\) \1\b/{s//\1 F/;bD}
/^\w\+ \(\w\+\)\b.*\n\1 \1#/{s/^\(\w*\) \w*\b/\1 T/;bD}
s/^\(\w*\) \w*\b/\1 F/
:D
# refine the parent-relation
:E
s/^\(\w*\)\( .* \)\w*#/\1\2\1/
tE
# add flags
s/\n$//
#============== end of unionfind ========================
# marking the edge
x
s/$/\n==end_of_edges==/
G
s/#\([^\n]*\n\)\(.*\)\n==end_of_edges==\n\w* \(.\).*/\3 \1#\2/
x
# remove return value of unonfind
s/.*\n//m
bY
:X
s///
s/\n$//
これのコメントをなくしてホールドスペースの保護用のマークを小さくすれば 1600B 弱になる.
次に,Kruskal のエッジソート後の処理を変えて作った Prim の実装.
# input:
# a set of edges
# each line: v_in v_out weight
#
# output:
# a set of edges marked with use-flag
# eachline: useflag v_in v_out weight
#
# useflag = T if the edge is used
# F if the edge is not used
#
# the order of edges in the output is not the same as the input
# edges are sorted by weights
#
H
$!d
s/.*//
x
s/\n//
h
# make a set of vertices
s/^\(\w* \w* \).*$/\1/gm
:Z
s/\b\(\w\+\)\b\(.*\b\1\b\)/\2/
tZ
s/[ \n]\+/ /g
s/^ *\| *$//g
x
# here,
# hold space: vertices
# pattern space: edges
# rearrange to weight,v_in,v_out
s/^\(\w*\) \(\w*\) \(.*\)$/\3,\1,\2/gm
#=================== start of merge sort by key ==============
# input: key1,value11,value12,...value1n\nkey2,value21,value22,...value2n\n...
# output: sorted items
#
# a key is a number (i.e., key = [0-9]\+)
# a value does not contain symbols (i.e., value = \w*)
#
s/$/ /gm
:a
/\n/!bq
# each line is sorted: key1,value11,value12,...,value1n key2,value21,...
# consequtive items are separated by a space
# each item has a key and values
# key is on the head, followed by values
s/^\(.*\)\n\(.*\)$/@\1: %\2/gm
# each line: @item11 item12 ... item1n : %item21 item22 ... item2m
# ':' is a separator of two lists
# '@' is the head of non-merged part of the first list
# '%' is the head of non-merged part of the second list
:b
# each line: sortedpart @item1i ... item1n : %item2j ... item2m
# already done
s/^[^%@:\n]*$/~&/gm
# done (2nd list is consumed completely)
s/^\(.*\)@\(.*\): % *$/~\1\2/gm
# done (1st list is consumed completely)
s/^\(.*\)@: \(.*\)%\(.*\)/~\1\2\3/gm
/@/!be
x
s/$/\n==mergesort_by_num_key==/
x
H
# ignore merged lists
s/^~.*$/~/gm
# prepare to comparation
s/^.*@\(\w*\)\(,\w*\)*.*%\(\w*\)\(,\w*\)*.*$/\1 \3/gm
#================= start of num_comparator ================
#
# num11 num12\nnum21 num22\n...\n~\n...\nnumn1 numn2
#
# each line is replaced with a relation
# numk1 < numk2 -> <
# numk1 > numk2 -> >
# numk1 = numk2 -> =
# a line of '~' is ignored
# preserve the hold space
x
s/$/\n==num_comparator==/
x
H
# comparation by lengths
:1
s/^ $/=/gm
s/^ .*/</gm
s/.* $/>/gm
s/^[0-9]\|\b[0-9]//gm
t1
s/\n//g
G
s/\n.*==num_comparator==//
x
s/\n==num_comparator==.*//
x
:2
s/\`=\(.*\)\n\(.*\)$/\1#\2/m
s/\`<\(.*\)\n\(.*\)$/\1#</m
s/\`>\(.*\)\n\(.*\)$/\1#>/m
s/\`~\(.*\)\n.*$/\1#~/m
t2
# comparation by values
s/#/\n/g
s/\n//
s/^/#/gm
:3
s/#\(.\)\(.*\) \(.\)\(.*\)/\1\3 #\2 \4/gm
t3
s/#//g
:4
s/00/=/g
s/0[1-9]/</g
s/[1-9]0/>/g
y/123456789/012345678/
/[0-9]/b4
s/ //gm
s/=*\(.\).*/\1/gm
#================= end of num_comparator ================
s/\n//g
G
s/\n.*==mergesort_by_num_key==/#/
x
s/\n==mergesort_by_num_key==.*//
x
# [<>=~]*#\nmergedpart @rest_of_list1 %rest_of_list2\n...
# according to the results of comparations held in the head,
# move a head of either the rest of list1 or list2 to the tail of merged_part.
:m
s/^[<=]\(.*\)\n\(.*\)@\(\(\w\|,\)*\) \(.*\)%\(\(\w\|,\)*\)/\1-\2\3 @\5%\6/m
s/^>\(.*\)\n\(.*\)@\(\(\w\|,\)*\)\(.*\)%\(\(\w\|,\)*\) /\1-\2\6 @\3\5%/m
s/^~\([^\n]*\)\n~/\1-/
tm
s/-/\n/g
s/#\n//
bb
:e
s/~//g
ba
:q
#=================== end of merge sort by key ==============
s/ $//
s/ /\n/g
# restore the order
s/^\(.*\),\(\w*\),\(\w*\)$/\2 \3 \1/gm
# pattern space: sorted edges
# hold space: vertices
s/^/F /gm
s/$/%/
G
s/%./\n% /
# the first vertex
s/ \w*$//
#======== start of main loop
:Y
s/#//
/%$/bX
s/F/#F/
:V
# marked with # is the current
# used v1, unused v2
/#F \(\w\+\)\b.*%.* \1\b/!{/#F\( \w* \(\w\+\)\b.*%.*\) \2\b/{s//#T\1/
bY}
bW}
# used v2 unused v1
/#F \w* \(\w\+\)\b.*%.* \1\b/!{/#F\( \(\w\+\)\b.*%.*\) \2\b/{s//#T\1/
bY}}
:W
/#\(.[^F]*\)F/!bY
s//\1#F/
bV
#======== end of main loop
:X
s/.%//
こいつのコメントなどの無駄を省くと 1350B強.
ということで,Prim のほうが短く出来た.union/find なんか使わずに単純な処理で追加するエッジが選べるので短い.
が,しかし,上のPrimの実装はKruskalに比べて2倍近く遅い.ソート部分が同じなので,エッジの選択だけだと3~4倍は遅い.おかげで timeout をくらい提出できず.まだまだ改良の余地があるけれど,打数が伸びそうなので放っておくことにする.
- Comments: 0
- TrackBack (Close): -
Haskell で並列計算を試す
Control.Parallel の par を使うと第一引数を別スレッドで評価してくれる(かも).ということで,試した.
import Control.Parallel
homP op f x = h x
where
h [a] = f a
h (a:x) = fa `par` (hx `seq` (fa `op` hx))
where fa = f a
hx = h x
mapP f = homP (++) (\x->[f x])
chunk n [] = []
chunk n x = take n x:chunk n (drop n x)
split p x = chunk (div (length x + p - 1) p) x
mapS f dxs = mapP (map f) dxs
reduceS op dxs = homP op (foldl1 op) dxs
main = print $ reduceS(+) $ mapS f $ split 32 [1..10000]
f n = fib 16
fib 0 = 1
fib 1 = 1
fib n = fib (n - 1) + fib (n - 2)
基本,split でチャンクに切り分けて,homP で各チャンクに対する計算を並列評価する.
par の型は a -> b -> b で,デフォルトだと最初の引数を完全に無視する.んで,コンパイルというかリンク時に -threaded をつけてやると別スレッドでの並列評価を試みるコードになってくれるらしい.あと,実行時に +RST -N2 とかやって, N オプションで生成するネイティブスレッドの数を指定する.
とりあえず上のコードを デュアルコアで動かしたら 2.5 秒が 1.5 秒になる程度の効果が現れた.確かに並列で評価してくれているらしい.
ちなみに,
main = print $ sum $ concat $ mapS f $ split 32 [1..10000]
としてあげると並列計算の効果が全く現れない.sum の計算が concat の結果を頭から消費していく形なので,前のチャンクの結果が sum で消費しつくされてからしか次のチャンクの評価が起こらない.なので,mapS f での各チャンクに対する map f の評価が逐次的にしか起きてくれず,並列計算にならない.と思う.とりあえず遅延評価があると思ったより面倒だなぁ.
- Comments: 0
- TrackBack (Close): -