Summary
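I wrote a shell script that uses the md5sum command to compute the MD5 checksum of every file in a folder automatically and stores the results in a _checksum folder at each level of the hierarchy.
The _checksum folder is created automatically if it does not exist. No checksum is generated for folders that contain only folders, or for hidden files. Files inside a _checksum folder are ignored.
macOS does not ship with the md5sum command, so md5sha1sum has to be installed with brew.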
$ brew update
$ brew upgrade
$ brew install md5sha1sum
Usage
% md5sum --version
Microbrew md5sum/sha1sum/ripemd160sum 0.9.5 (Wed Dec 6 12:48:56 EST 2006)
Compiled Oct 23 2021 at 02:12:08
Written by Bulent Yilmaz
Copyright (C) 2004,2006 Microbrew Software
%
% md5sum --help
Usage: md5sum [<option>] <file> [<file> [...] ]
md5sum [<option>] --check <file>
Note: These options are mostly compatible with GNU md5sum
-s, -h, and -V are not available in GNU md5sum
-b, --binary Read files in binary mode
-c, --check <file> Check MD5 sums from <file>
-t, --text Read files in ASCII mode
-s, --status Silent mode: Use exit code to determine verification
-h, --help Display this help message and exit
-V, --version Display program version and exit
On a Mac, the md5 command is apparently used instead. Its usage can be checked with "man md5".
% man md5
MD5(1) General Commands Manual MD5(1)
NAME
md5 – calculate a message-digest fingerprint (checksum) for a file
SYNOPSIS
md5 [-pqrtx] [-s string] [file ...]
DESCRIPTION
The md5 utility takes as input a message of arbitrary length and produces
as output a “fingerprint” or “message digest” of the input. It is
conjectured that it is computationally infeasible to produce two messages
having the same message digest, or to produce any message having a given
prespecified target message digest. The MD5 algorithm is intended for
digital signature applications, where a large file must be “compressed” in
a secure manner before being encrypted with a private (secret) key under a
public-key cryptosystem such as RSA.
MD5's designer Ron Rivest has stated "md5 and sha1 are both clearly broken
(in terms of collision-resistance)". So MD5 should be avoided when
creating new protocols, or implementing protocols with better options.
SHA256 and SHA512 are better options as they have been more resilient to
attacks (as of 2009).
The following options may be used in any combination and must precede any
files named on the command line. The hexadecimal checksum of each file
listed on the command line is printed after the options are processed.
-s string
Print a checksum of the given string.
-p Echo stdin to stdout and append the checksum to stdout.
-q Quiet mode - only the checksum is printed out. Overrides the -r
option.
-r Reverses the format of the output. This helps with visual diffs.
Does nothing when combined with the -ptx options.
-t Run a built-in time trial.
-x Run a built-in test script.
EXIT STATUS
The md5 utility exits 0 on success, and 1 if at least one of the input
files could not be read.
SEE ALSO
cksum(1), CC_SHA256_Init(3), md5(3), ripemd(3), sha(3)
R. Rivest, The MD5 Message-Digest Algorithm, RFC1321.
Vlastimil Klima, Finding MD5 Collisions - a Toy For a Notebook, Cryptology
ePrint Archive: Report 2005/075.
ACKNOWLEDGMENTS
This program is placed in the public domain for free general use by RSA
Data Security.
macOS 12.1 June 6, 2004 macOS 12.1
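For reference:
Make generate_md5sum.sh executable with chmod, then run the script with the folder that contains the files to be checksummed as its argument. If the md5sum command is not available, install it beforehand.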
% chmod a+x ./generate_md5sum.sh
% ./generate_md5sum.sh ./target_dir
Tricks and open issues
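Set IFS=$'\n' so that the file list is split only at newlines and file names containing spaces stay intact (do not forget to restore the original value at the end).
At first I did not know how to use $ (the end-of-line anchor in sed) inside a shell script; my workaround was to wrap it in double quotes, as in "$". Note that inside double quotes $# expands to the number of arguments, and that bash treats $"..." as a locale-translated string and normally drops the $, so single quotes or \$ are the safer way to pass a literal $ to sed.
With / as the sed delimiter, the folder separators would have to be escaped with \, which gets confusing, so the delimiter is changed to another harmless character. For example, sed "s_aaa_bbb_g" changes the delimiter to _.
#!/bin/bash
# ============================================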
# Calculate md5sum in the target folder
#
# generate_md5sum.sh
# Coded by Noboru Harada (noboru@ieee.org)
#
# Changes:
# 2022/10/09 The first version
#
# Usage:
# > generate_md5sum.sh ./target_dir
#
# tested on Mac
# ============================================
if [ $# -lt 1 ]; then
    echo "USAGE: generate_md5sum.sh dir_path"
    exit 1
fi
dir_path=$1
#echo "$dir_path"
# collect the target folder and its subfolders (up to 5 levels deep)
dirs=$(find "$dir_path" -maxdepth 5 -type d)
if [ -z "$dirs" ]; then
    dirs="$1"
fi
#echo "$dirs"
# change IFS for filenames with white spaces
IFS_BACK="$IFS"
IFS=$'\n'
# dig the target dir
for dir in $dirs;
do
    echo "Processing DIR: $dir"
    dir_checksum=$(echo "$dir/_checksum" | sed -e "s#//#/#g")
    # list regular files directly under $dir whose names contain a dot
    files=$(find "$dir" -maxdepth 1 -type f -name "*.*")
    files_strip=$(echo "$files" | sed -e "s#//#/#g")
    if [ -d "$dir_checksum" ]; then
        echo " $dir_checksum exists."
        echo " Skip (case1): $dir_checksum"
    else
        # ignore _checksum folder for searching target
        # (single quotes keep the trailing $ as sed's end-of-line anchor)
        dir_checksum_strip=$(echo "$dir_checksum" | sed -e 's#.*_checksum/_checksum$##g')
        if [ -z "$dir_checksum_strip" ]; then
            echo " Skip (case2): $dir_checksum"
            files=""
            files_strip=""
        else
            if [ -z "$files_strip" ]; then
                echo " Only folder exists in $dir"
                echo " Skip (case3): $dir_checksum"
            else
                echo "$dir_checksum does not exist."
                echo "mkdir $dir_checksum"
                mkdir "$dir_checksum"
            fi
        fi
    fi
    for file in $files_strip;
    do
        # strip dir names (remove the very last '/' and previous characters)
        file_strip1=$(echo "$file" | sed -e "s#^.*/##g")
        # ignore invisible files (starting with .) and *.md5sum files
        # (single quotes keep the trailing $ as sed's end-of-line anchor)
        file_strip2=$(echo "$file_strip1" | sed -e 's#^\..*##g' | sed -e 's!.*md5sum$!!g')
        echo "$file_strip2"
        if [ -z "$file_strip2" ]; then
            echo " Skip (case4): Don't process DIR, HIDDEN file or .md5sum: $file"
        else
            echo " Processing: $file"
            if [ -e "$file" ]; then
                sum_file=$(echo "$dir/_checksum/$file_strip1.md5sum" | sed -e "s#//#/#g")
                echo " md5sum \"$file\" > \"$sum_file\""
                md5sum "$file" | sed -e "s#$dir/##g" > "$sum_file"
            fi
        fi
    done
done
IFS="$IFS_BACK"
## end
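The three tricks used in the script can be tried in isolation. The block below is a small sketch with hypothetical function names (count_items, base_name, is_md5sum_name), written for a plain POSIX sh. It sets IFS with a literal newline, which has the same effect as IFS=$'\n' in bash, and it passes the $ anchor to sed in single quotes rather than as "$".

```shell
#!/bin/sh
# Trick 1: split a newline-separated list at newlines only, then restore IFS.
count_items() {
    old_ifs=$IFS
    IFS='
'   # a literal newline; equivalent to IFS=$'\n' in bash
    count=0
    for item in $1; do      # unquoted on purpose so that IFS splitting applies
        count=$((count + 1))
    done
    IFS=$old_ifs            # restore the original separators
    echo "$count"
}

# Trick 2: use '#' as the sed delimiter so that '/' in paths needs no escaping.
base_name() {
    printf '%s\n' "$1" | sed -e 's#^.*/##'
}

# Trick 3: single quotes keep '$' literal, so sed sees it as the end-of-line anchor.
# Succeeds only when the name ends in "md5sum".
is_md5sum_name() {
    [ -z "$(printf '%s\n' "$1" | sed -e 's!.*md5sum$!!')" ]
}
```

With these definitions, a list of two names where one contains a space is counted as two items, base_name ./target_dir/sub/a.txt prints a.txt, and is_md5sum_name accepts a.txt.md5sum but rejects md5sum_notes.txt.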
I had forgotten quite a lot, so this took a bit of effort.