Best way to simulate multidimensional arrays "objects" in BASH?

Looks like I’ve arrived really late in this discussion! It’s true Bash has its uses, no doubt. I understand the sentiment of moving quickly to a scripting language instead too. I’d probably opt for Perl if I had to, not knowing much Python (yet).

The old joke about Perl being a write-only language still stands, I think :wink:

Nice work! Brings back memories of working with Perl. I loved the Perl Camel Book – read that thing word for word with many underlines and notes (Still have it). Circa 1996

I remember reading “Learning Perl” on the train for the first time coming home from the bookstore and I couldn’t stop laughing!

Progress 11: POSIX or bust

This is a proof of concept for zero knowledge traversal of a BAAML dataset, declaring the datatype of each property along with its address.

The code below is the initial foundation for replacing perl and RegEx.

#!/usr/bin/env sh

MyData=`cat <<\EOF
	SpaceX
		headquarters
			address:Rocket Road
			city:Hawthorne
			state:California
		links
			website:https://www.spacex.com/
			flickr:https://www.flickr.com/photos/spacex/
			twitter:https://twitter.com/SpaceX
			elon_twitter:https://twitter.com/elonmusk
		name:SpaceX
		founder:Elon Musk
		founded:2002
		employees:8000
EOF`

HuntMode="TabCount"; Level=0; PropName=""

while IFS= read -rn1 Char; do
	if [ "$HuntMode" = "TabSearch" ]; then
		# Look for tabs
		if [ "$Char" = "	" ]; then
			HuntMode="TabCount"
		fi
	fi
	if [ "$HuntMode" = "TabCount" ]; then
		# Count tabs
		if [ "$Char" = "	" ]; then
			(( Level = Level + 1 ))
		else
			# Property name found
			# The property name array needs to be ready to have the next property name appended to it. It needs to be trimmed if the hierarchical $Level is the same or less than the number of property names present.
			if [ "$(( $# - $Level))" -ge 0 ]; then
				for i in $(seq 1 $#); do
					[ "$i" -lt $Level ] && set -- "$@" "$1"
					shift
				done
			fi
			HuntMode="PropertyName"
		fi
	fi
	if [ "$HuntMode" = "PropertyName" ]; then
		# Gather property name
		if [ "$Char" = ":" ] || [ "$Char" = "" ]; then
			# Property name gathered, add to array
			set -- "$@" "${PropName}"
			# Reset state back to tab search skipping value
			HuntMode="TabSearch"; Level=0; PropName=""
			# Declare the property name and type for the demo
			if [ "$Char" = ":" ]; then # Property is a value pair
				echo "\"$@\" is a value pair"
			elif [ "$Char" = "" ]; then # Property is an object
				echo "\"$@\" is an object"
			fi
		else
			# Read property name
			PropName="${PropName}${Char}"
		fi
	fi
done <<< "$MyData"

Output:

"SpaceX" is an object
"SpaceX headquarters" is an object
"SpaceX headquarters address" is a value pair
"SpaceX headquarters city" is a value pair
"SpaceX headquarters state" is a value pair
"SpaceX links" is an object
"SpaceX links website" is a value pair
"SpaceX links flickr" is a value pair
"SpaceX links twitter" is a value pair
"SpaceX links elon_twitter" is a value pair
"SpaceX name" is a value pair
"SpaceX founder" is a value pair
"SpaceX founded" is a value pair
"SpaceX employees" is a value pair

Takeaway:

POSIX Shell doesn’t have arrays but it can use arguments like an array.

I needed something to keep track of property names as the script moved through the hierarchy and this fit the bill. shift makes it easy to remove arguments from the beginning but the tricky part was removing arguments from the end.

I came up with this:

set Setting 5 new array "elements here"

# Remove 2 arguments from the end
X=2
for i in $(seq 1 $#); do
	[ "$i" -le $(($# - $X)) ] && set -- "$@" "$1"
	shift
done

echo $@

Output:

Setting 5 new

  • It loops as many times as there are arguments.
  • Each loop takes the 1st argument and copies it to the end, then deletes the 1st argument.
  • Unless… it’s within $X of the argument total in which case it doesn’t copy the 1st argument, it just deletes it.
  • When the loop ends, all arguments are back in their original order and $X arguments are gone from the end.

Cheat sheet for more commands here:

Is that the only way to remove a value from the end? Seems…inefficient…

set A B C D
for i in $(seq 1 $#); do
	[ "$i" -le $(($# - 2)) ] && set -- "$@" "$1"
	shift
done
echo $@

Output:

A B

I’d die for a better way that’s still POSIX lol.

It doesn’t need to be an array, I just need a way to append a non-spaced word to the end of a String and be able to remove X words from the end without using more resources than that strategy.

Imagining this data set:

MyData=`cat <<\EOF
	SpaceX
		headquarters
			address:Rocket Road
	NASA
		headquarters
			address:300 Hidden Figures Way SW
EOF`

…the String would need to contain these values in order:

SpaceX
SpaceX headquarters
SpaceX headquarters address
NASA
NASA headquarters
NASA headquarters address

All it knows is the word that needs to be appended -or- how many words need to be removed.

The objective is to have a String that’ll match the one given to the function when the correct address is found.
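One possible POSIX-only alternative, sketched under the assumption that property names never contain spaces: keep the address in a single space-separated string and trim words from the end with the `${var% *}` expansion. `Path`, `path_push` and `path_pop` are made-up names for illustration, not part of BAAM.

```shell
#!/bin/sh
# Hypothetical helpers: the address lives in one space-separated string.
Path=""

path_push() { # append one word to the end
	if [ -z "$Path" ]; then Path=$1; else Path="$Path $1"; fi
}

path_pop() { # remove $1 words from the end
	n=$1
	while [ "$n" -gt 0 ] && [ -n "$Path" ]; do
		case $Path in
			*" "*) Path=${Path% *} ;; # strip the last " word"
			*) Path="" ;;             # only one word left
		esac
		n=$(( n - 1 ))
	done
}

path_push SpaceX; path_push headquarters; path_push address
echo "$Path"   # SpaceX headquarters address
path_pop 3
path_push NASA
echo "$Path"   # NASA
```

No subshells or external commands are involved, so the cost per pop is a single parameter expansion rather than a pass over every argument.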

I was letting my mind wander and I thought up a rev solution. This actually works…

The argument order is reversed so shift is removing arguments that were formerly on the end. Then the order is reversed back.

set A B C D
set -- $(rev <<< "$@")
shift 2
set -- $(rev <<< "$@")
echo $@

A B

I did some time tests and they seem to take the same amount of time though.

That looks easier to read. Actually, I had misunderstood what was actually going on in the first method. I thought it was using a lot of memory but I see the array is only size+1 until it finishes.

Progress 12: Put it on my tab

When a BAAML object is added, removed or changed, its hierarchical tabbing needs to be changed to suit its destination.

For example if I want to move object "Warehouse Rice" to "Store Shelf_1", the Rice object and its properties have 2 base tabs. These need to be increased to 3 to put them in "Shelf_1", otherwise Rice will just be a property of Store.

[Image: base tabs]

The original BAAMX way:

BAAMX accomplished this using a whole bunch of Perl…

BAAMX_SET_BASE (){ # Engine: Change the base tab count of every line in a String
	VAL=$1; SIZE=$2; BASE_COUNT=$3 # SIZE: 0="	", 1="		", etc.
	BASE_NEW=$(printf '\t%.0s' $(seq 1 $SIZE))

	# If the value is an object it may need hierarchical delimitation adjustment.
	if [[ ${VAL:0:1} == "	" ]]; then # VAL is an Object

		# If the base count of VAL isn't provided, get the base count
		if [ -z "$BASE_COUNT" ]; then BASE_COUNT=$(perl -e "'$VAL' =~ m/^\t+/s; print $&;" | wc -m); fi

		# Adjust the value if the length of hierarchical delimitation isn't the same as the property it's being added to.
		if [ "$BASE_COUNT" -ne "$SIZE" ]; then VAL=$(perl -p -e "s/(^|[^\t])\K\t{$BASE_COUNT}/$BASE_NEW/g" <<< $VAL); fi

		printf '%s' "$VAL"

	else  # VAL is a simple value
		printf '%s' "$BASE_NEW$VAL"
	fi
}

The new BAAM way:

Not only is this POSIX and not only is this radically faster but it’s a drop-in replacement for BAAMX’s BAAMX_SET_BASE so both projects get an upgrade.

BAAM_SetBase (){
	Val=$1; Size=$2; BaseCount=$3
	# If the base count of Val isn't provided, get the base count.
	if [ -z "$BaseCount" ]; then
		while IFS= read -rn1 Char; do
			if [ "$Char" = "	" ]; then (( BaseCount = BaseCount + 1 ))
			else break; fi
		done <<< "$Val"
	fi

	# Adjust the value if the length of hierarchical delimitation isn't the same as the property it's being added to.
	if [ "$BaseCount" -gt "$Size" ]; then
		Val=$(cut -c $(( BaseCount - Size + 1 ))-  <<< "$Val") # Hangs the system
	elif [ "$BaseCount" -lt "$Size" ]; then
		TabsToAdd=$(printf '\t%.0s' $(seq 1 $(( Size - BaseCount ))))
		Val=$(sed "s/^/$TabsToAdd/" <<< "$Val")
	fi

	printf '%s' "$Val"
}

In action

[Image: setbase]

A takeaway

# Repeat the contents of printf X times:
printf 'Hey%.0s' $(seq 1 4)

HeyHeyHeyHey

# Same result moving %.0s
printf '%.0sHey' $(seq 1 4)

HeyHeyHeyHey
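Why this works: printf reuses its format string until every argument has been consumed, and `%.0s` consumes one argument while printing zero characters of it. A small sketch of the same trick wrapped in a helper (`make_tabs` is a made-up name, and `seq` is assumed to be installed):

```shell
#!/bin/sh
# Hypothetical helper: print COUNT tab characters.
make_tabs() {
	# Guard: with zero arguments printf still prints the format once,
	# so a count of 0 would otherwise yield one tab instead of none.
	[ "$1" -gt 0 ] || return 0
	printf '\t%.0s' $(seq 1 "$1")
}

printf '[%s]\n' "$(make_tabs 3)"
```

The guard matters whenever the count can legitimately be zero.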

Progress 13: What’s in your bug out bag?

Most of the complicated bits of BAAMX are RegEx so regexr.com really made life easy with debugging. Not so much with BAAM as the magic now happens inside sh.

I’m in the prototyping stages but to really nail things down I had to create a debugger that’ll go step by step to make life considerably easier when a test fails.

Below is a visual representation of BAAM running the following command:

BAAM "$MyData" GetVal SpaceX links website

[Image: BAAM_debugger]

Future iterations of BAAM won’t do as much processing on data it knows won’t contain a match but this is BAAM’s present state.

A few thoughts

It’s looking like BAAM is going to be superior to BAAMX in speed, features, function and intuitiveness but at the cost of a larger code base (though not by much).

BAAM can return useful errors unlike BAAMX but doing a good job of returning errors is another thing entirely. I’ll probably release BAAM with minimal error work and then consider if I want to do something elegant.

Things are looking good.

There was a recent Terminalforlife video containing a heredoc and it got me curious.

Syntax of a heredoc:

[COMMAND] <<[-] 'DELIMITER'
 HERE-DOCUMENT
DELIMITER

Example heredoc delivering arbitrary text to cat:

lines='rows'
cat << 'EOF'
  some arbitrary
  text on 2 $lines
EOF
# Output:
  some arbitrary
  text on 2 $lines

Here’s the catch…

If the delimiter (EOF in this case) isn’t quoted, the shell performs variable expansion inside the heredoc:

lines='rows'
cat << EOF
  some arbitrary
  text on 2 $lines
EOF
# Output:
  some arbitrary
  text on 2 rows

However… you can prepend a \ to the delimiter instead and it’ll prevent variable expansion, just like quoting it.

lines='rows'
cat <<\EOF
...

Fascinating and a useful thing to know for expanding variables directly into BAAML datasets should there be the need.
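One more case from the syntax line above that matters for tab-delimited data: the optional `-` (as in `<<-`) makes the shell strip all leading tab characters from the heredoc body. A minimal sketch, assuming the two body lines below are indented with literal tabs (`Stripped` is just an illustrative variable name):

```shell
#!/bin/sh
# The two body lines start with one and two literal tab characters.
Stripped=$(cat <<-'EOF'
	SpaceX
		name:SpaceX
EOF
)
# Every leading tab is gone, so the hierarchy is lost.
printf '%s\n' "$Stripped"
```

For a BAAML dataset that means `<<-` should probably be avoided, since the leading tabs are what carry the structure.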

More special cases here: https://linuxize.com/post/bash-heredoc/

heredoc is a great tool, especially if you want “text editor like” editing of variable or printable content. It easily allows you to use indents, spacing, quoting, etc. without having to fuss with \n or quoting/unquoting or rotating quote styles.

I use it in perl and php but I didn’t know that this was available in bash. Where did you find the documentation for this? as I get:

apropos heredoc
heredoc: nothing appropriate.
apropos "<<<"
<<<: nothing appropriate.

There’s a good run down here: https://linuxize.com/post/bash-heredoc/

They’re definitely a beautiful thing. :slight_smile:

I’m in a difficult position.

BAAM is complete… but…

It’s not POSIX

sh is often just a symlink to bash (as on Fedora), but I’ve discovered it’s vastly more permissive with non-POSIX script on Fedora than it is on Debian, where sh points to dash instead. That definitely made things less intuitive developing this on Fedora.
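To see which shell `sh` actually resolves to on a given system, something like this should work. It is only a small sketch: `ShPath` and `ShReal` are illustrative names, and `readlink -f` is assumed to be available.

```shell
#!/bin/sh
# Find the sh on PATH, then follow any symlinks to the real binary.
ShPath=$(command -v sh)
ShReal=$(readlink -f "$ShPath")
printf 'sh -> %s\n' "$ShReal"
```

Running the script under test with that binary directly (for example `dash ./script.sh`) is a quick way to surface non-POSIX syntax that bash silently accepts.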

Shellcheck also doesn’t seem to care if #!/usr/bin/env sh is in the header and short of far deeper research into the syntax it’s just a nightmare ensuring POSIX compliance with a script this size.

How I feel about POSIX has changed

While it provides portability guarantees, in practice compared to bash the only benefit is compatibility with Apple software beyond BASH v3.2 and extremely old operating systems.

It’s also a pain in the butt to do complex things and it’s often at a performance cost. If BASH is going to interpret BAAM whether it’s POSIX or not, it’s practically a waste not to use it.

BAAM is big

It’s ~200 lines w/o comments. Ideally I’d like something closer to BAAMX for ease of maintenance and lowering interpretation overhead.

I’ve also been considering a completely different method of parsing BAAML that’s far cleaner, far more elegant and requires much less code.

The real goal was to remove dependencies

This was really about removing dependencies for a pure “in-house” solution as BAAMX was more like Perl with a thin BASH wrapper and tons of interpreter overhead. Making it POSIX was taking that to the nth degree which really wasn’t necessary.

What’s the future of BAAM?

BAAM is now strictly a BASH project. I need to clean it up a bit but the first working version of BAAM will be posted. It’ll then be entirely re-written with a new BAAML parser.

My goal is to settle on a fully native, simple to use, elegant solution to BASH multi-dimensional associative arrays and the 2nd version of BAAM after I post the 1st should be it.

Progress 14: Bashing out a prototype

This is the first prototype of a BASH native solution using a pointer that reads character by character. I didn’t want to spend too much time cleaning it up and writing examples as it’ll be undergoing massive changes. There are also a few uses of wc, cut, sed, etc. which won’t be in the next release.

NEXT RELEASE is the one to look out for.

Full library:

#!/usr/bin/env bash
Err(){
	printf '%s\n' "$2" 1>&2
	[ $1 -gt 0 ] && exit $1
}

Baam(){
	# Handle arguments
	Db=$1; shift
	Mode=$1; shift
	if [ "$Mode" = "SetVal" ]; then InsertVal=$1; shift; fi
	MatchLevel=$#
	AddrTarget="$@ "

	# Declare vars
	DataPos=-1
	Level=0
	LevelFocus=1
	HuntMode='BaamHuntTabCount'
	NewLine='
'

	while read -rN1 Char; do
		(( DataPos++ ))
		${HuntMode}
		[ "$HuntMode" = 'Done' ] && break
	done <<< "$Db"
	BaamHuntReturn
}

BaamSetBase(){
	Val=$1; DestLevel=$2; CurLevel=$3
	# If the base count of Val isn't provided, get the base count.
	if [ -z "$CurLevel" ]; then
		Tabs="${Val%%[!	]*}"
		CurLevel=${#Tabs}
	fi

	# Adjust the value if the length of hierarchical delimitation isn't the same as the property it's being added to.
	if [ "$CurLevel" -gt "$DestLevel" ]; then
		Val=$(cut -c $(( CurLevel - DestLevel + 1 ))-  <<< "$Val")
	elif [ "$CurLevel" -lt "$DestLevel" ]; then
		TabsToAdd=$(printf '\t%.0s' $(seq 1 $(( DestLevel - CurLevel ))))
		Val=$(sed "s/^/$TabsToAdd/" <<< "$Val")
	fi

	printf '%s' "$Val"
}

BaamHuntTabSearch(){
	# Look for tabs
	if [ "$Char" = '	' ]; then
		HuntMode='BaamHuntTabCount'
		BaamHuntTabCount
	elif [ -n "$ValBufferOn" ]; then
		ValBuffer="$ValBuffer$Char" # Will include the trailing newline in final result
	fi
}

BaamHuntTabCount(){
	# Count tabs
	if [ "$Char" = '	' ]; then
		(( Level = Level + 1 ))
		[ -n "$ValBufferOn" ] && ValBufferPre="$ValBufferPre$Char"
	else
		# Property name found, initialize property reading
		if [ -n "$AddrFound" ]; then
			if [ -n "$ValBufferOn" ] && [ "$Level" -gt "$LevelFocus" ]; then
				# Commit the ValBufferPre and the current character as they're nested inside the value
				ValBuffer="$ValBuffer$ValBufferPre$Char"
				ValBufferPre=''
			fi

			if [ "$Level" -le "$LevelFocus" ]; then
				if [ "$MatchPropType" = 'ValuePair' ]; then MatchPosEndOfVal=$(( DataPos - Level - 1 )) # No trailing newline
				elif [ "$MatchPropType" = 'Object' ]; then MatchPosEndOfVal=$(( DataPos - Level )) # Includes trailing newline
				fi
				HuntMode='Done'

			elif [ "$Mode" = "LsProps" ] && [ "$Level" -eq "$LevelFocusForLsProps" ]; then
				HuntMode="BaamHuntPropertyName"
				BaamHuntPropertyName
			else
				Level=0; HuntMode="BaamHuntTabSearch"
			fi
		else
			if [ "$Level" -lt "$LevelFocus" ]; then
				HuntMode='Done'
			elif [ "$Level" -gt "$LevelFocus" ]; then
				Level=0; HuntMode="BaamHuntTabSearch"
			else
				HuntMode="BaamHuntPropertyName"
				BaamHuntPropertyName
			fi
		fi
	fi
}

BaamHuntPropertyName(){
	# Gather property name
	if [ "$Char" = ":" ] || [ "$Char" = "$NewLine" ]; then
		# Property name gathered

		if [ -n "$AddrFound" ]; then
			# The address has already been found, if BaamHuntPropertyName is being called it's a nested property at the LsPropsFocusLevel
			#if [ "$Mode" = "LsProps" ]; then # Redundant?
				if [ -z "$PropListBuffer" ]; then PropListBuffer=$PropNameBuffer
				else PropListBuffer="$PropListBuffer $PropNameBuffer"
				fi
			#fi
		else

			AddrToTest="$AddrBuffer$PropNameBuffer "
			if [ "$AddrTarget" = "$AddrToTest" ]; then
				AddrFound=1
				MatchPosStartOfVal=$(( DataPos - 1 ))
				MatchPosStartOfProp=$(( DataPos - 1 - $(expr length "$PropNameBuffer") - Level ))

				[ "$Mode" = "GetVal" ] && ValBufferOn='1'
				[ "$Mode" = "LsProps" ] && LevelFocusForLsProps=$(( LevelFocus + 1 ))

				if [ "$Char" = ":" ]; then MatchPropType="ValuePair"
				elif [ "$Char" = "$NewLine" ]; then MatchPropType="Object"
				fi

			else

				AddrToTestLen=$(( $(wc --char <<< "$AddrToTest") -1 ))
				AddrTargetPart="${AddrTarget:0:$AddrToTestLen}"

				if [ "$AddrToTest" = "$AddrTargetPart" ]; then
					# >>>>>> PROP MATCH <<<<<<<
					(( LevelFocus++ ))
					AddrBuffer=$AddrToTest
				fi
			fi
		fi

		# Set state back to tab search
		HuntMode="BaamHuntTabSearch"; Level=0; PropNameBuffer=""
	else
		# Read into property name
		PropNameBuffer="${PropNameBuffer}${Char}"
	fi
}

BaamHuntReturn() {
	if [ -z "$AddrFound" ]; then
		Err 1 'AddrNotFound'
	else
		[ -z "$MatchPosEndOfVal" ] && MatchPosEndOfVal=$(wc --char <<< "$Db")
		if [ "$Mode" = 'GetVal' ]; then
			[ "$MatchPropType" = 'Object' ] && ValBuffer=$(BaamSetBase "$ValBuffer" 1)
			printf '%s' "$ValBuffer"
		elif [ "$Mode" = 'LsProps' ]; then
			printf '%s' "$PropListBuffer"
		elif [ "$Mode" = 'SetVal' ]; then
			if [ "${InsertVal:0:1}" = "	" ]; then
				InsertVal="$(BaamSetBase "$InsertVal" $(( MatchLevel + 1 )) 1)"
				printf '%s' "${Db:0:$(( MatchPosStartOfVal + 1 ))}$NewLine${InsertVal}${Db:$MatchPosEndOfVal}"
			else
				printf '%s' "${Db:0:$(( MatchPosStartOfVal + 1 ))}:${InsertVal}${Db:$MatchPosEndOfVal}"
			fi
		elif [ "$Mode" = 'RmProp' ]; then
			printf '%s' "${Db:0:$MatchPosStartOfProp}${Db:$MatchPosEndOfVal}"
			printf '%s' "$Val"
		else
			Err 1 'BadMode'
		fi
	fi
}

Examples:

Example dataset in BAAML:

MyData=`cat << 'EOF'
	SpaceX
		name:SpaceX
		links
			website:https://www.spacex.com/
			flickr:https://www.flickr.com/photos/spacex/
		founder:Elon Musk
EOF`

Change SpaceX → links → website to: https://en.wikipedia.org/wiki/SpaceX and output the value

MyData=$(Baam "$MyData" SetVal 'https://en.wikipedia.org/wiki/SpaceX' SpaceX links website)
echo "$MyData"
echo
echo "New website! $(Baam "$MyData" GetVal SpaceX links website)"

Output:

	SpaceX
		name:SpaceX
		links
			website:https://en.wikipedia.org/wiki/SpaceX
			flickr:https://www.flickr.com/photos/spacex/
		founder:Elon Musk

New website! https://en.wikipedia.org/wiki/SpaceX

Add a hand-written object to SpaceX → Links

NewLinkObj=`cat << 'EOF'
	dragon
		webcast:https://youtu.be/xY96v0OIcK4
		patch:https://images2.imgbox.com/ab/79/Wyc9K7fv_o.png
EOF`
MyData=$(Baam "$MyData" SetVal "$NewLinkObj" SpaceX links)
echo "$MyData"

Output:

	SpaceX
		name:SpaceX
		links
			website:https://www.spacex.com/
			flickr:https://www.flickr.com/photos/spacex/
			dragon
				webcast:https://youtu.be/xY96v0OIcK4
				patch:https://images2.imgbox.com/ab/79/Wyc9K7fv_o.png
		founder:Elon Musk

Delete property SpaceX → Links

MyData=$(Baam "$MyData" RmProp SpaceX links)
echo "$MyData"

Output:

	SpaceX
		name:SpaceX
		founder:Elon Musk

List every child property of SpaceX

echo $(Baam "$MyData" LsProps SpaceX)

Output:

name links founder

Progress 15: The first step toward the elegant answer

Re-beating the 1st challenge in pure BASH

MY_DATA=`cat << 'EOF'
	SpaceX
		headquarters
			address:Rocket Road
			city:Hawthorne
			state:California
		links
			website:https://www.spacex.com/
			flickr:https://www.flickr.com/photos/spacex/
			twitter:https://twitter.com/SpaceX
			elon_twitter:https://twitter.com/elonmusk
		name:SpaceX
		founder:Elon Musk
		founded:2002
		employees:8000
EOF`

Posted here (and below) was an answer to a self-challenge I set: that I could read multi-dimensional associative arrays in BASH in < 10 lines.

Sadly that was “BASH” in big air quotes because it was really a RegEx string generator for grep to find the result using PCRE (why I later renamed it BAAMX) and if the BAAML had newlines they had to be removed.

# If the database has newlines, old BAAM needs them removed:
MY_DATA=${MY_DATA//$'\n'/}
BAAM (){ # BAAM - [B]ASH [A]ssociative [A]rrays in [M]ulti-dimensions
	REGEX_STR='(^|[^\t])\t{1}'$2'((?=\t{2}[^\t])|:)\K.*?'
	LEVEL=1
	for PROP_NAME in "${@:3}"; do
		REGEX_STR+='\t{'$((LEVEL++))'}'$PROP_NAME'((?=\t{'$(($LEVEL+1))'}[^\t])|:)\K.*?'
	done
	REGEX_STR+='[^\t](?=$|\t{1,'$LEVEL'}[^\t])'
	echo $(echo "$1" | grep -oPe "$REGEX_STR")
}

Here’s the solution in 9 lines of pure BASH 10 months later.

BAAM (){ # BAAM - [B]ASH [A]ssociative [A]rrays in [M]ulti-dimensions
	DbLayer=$1; shift; Blocks=
	for Prop in "$@"; do
		Blocks=$Blocks'	'
		DbLayerNext=${DbLayer#*$Blocks$Prop[^[:alnum:]]}
		DbLayer=${DbLayerNext%%$'\n'$Blocks[^	]*}
	done
	printf '%s' "$DbLayer"
}

Both give the same result:

echo "$(BAAM "$MY_DATA" SpaceX links website)"

https://www.spacex.com/
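One way to read the loop: at each level the first expansion cuts away everything up to and including the property name at that depth, and the second cuts away everything from the next line that sits at the same depth. The tracing variant below is only a sketch of that reading, not the author's code. `BaamTrace`, `Tab` and `Nl` are made-up names, and it uses `[!...]` with literal tab/newline variables (instead of `[^...]` and `$'\n'`) so it should also run under a plain POSIX sh.

```shell
#!/bin/sh
# Hypothetical tracing variant of the lookup loop.
BaamTrace() {
	DbLayer=$1; shift; Blocks=
	Tab=$(printf '\t'); Nl='
'
	for Prop in "$@"; do
		Blocks=$Blocks$Tab
		# 1. Drop everything up to and including "<tabs><Prop>" plus its ":" or newline.
		DbLayerNext=${DbLayer#*"$Blocks$Prop"[![:alnum:]]}
		# 2. Drop everything from the next line that starts at this same depth.
		DbLayer=${DbLayerNext%%"$Nl$Blocks"[!$Tab]*}
		# Show what is left after this level (on stderr, so stdout stays clean).
		printf 'after %s:\n%s\n---\n' "$Prop" "$DbLayer" >&2
	done
	printf '%s' "$DbLayer"
}

# Small dataset built with printf so the tabs are explicit.
Db=$(printf '\tSpaceX\n\t\tlinks\n\t\t\twebsite:https://www.spacex.com/\n\t\tname:SpaceX\n')
BaamTrace "$Db" SpaceX links website
```

Each pass narrows `DbLayer` to the body of the matched property, so the final pass leaves just the value.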

But will it run Crysis?

No.

This speed test grabs the website of SpaceX from the dataset and puts it in a variable 50,000 times.

for value in {1..50000}; do
	Val=$(BAAM "$MY_DATA" SpaceX links website)
done

Results

Old version using PCRE:
(I removed the new lines before the loop to make things as fast as possible)

Run 1: 2m39.959s
Run 2: 2m40.426s
Run 3: 2m40.119s

New version using pure BASH:

Run 1: 41.188s
Run 2: 39.975s
Run 3: 39.628s

Conclusions

Same number of lines, less code and about 4x faster (roughly 160 s down to 40 s for 50,000 lookups) at 1,250 lookups per second.

I tried exchanging the last line of the old version for the Perl solution but it increased times to 3m26s.

perl -e "'$1' =~ m/$REGEX_STR/s; print $&;"

This project is now firmly on the road to an elegant pure BASH solution.

Wow, that speed improvement is impressive.

It’s definitely been quite a road to get here. I should be able to convert every BAAMX feature into pure BASH at this point.

Quick GIF of BAAM working its way through the dataset:

[Image: BAAM]

Progress 16: Traversing the tabiverse

Fully featured read-only version of BAAM is now complete in pure BASH.

New features:

  • Zero knowledge traversal: The entire database can be traversed without knowing what’s inside it.
  • Tab normalization: If a returned value is an object, it’ll be tab normalized so it can be immediately used as its own BAAM database.
  • Tiny: 28 lines to do what BAAMX needs 34 lines to accomplish.
    • Both BaamGetVal and BaamLsProps functions are optional.
    • Baam can be called directly the same way BaamGetVal is called, the value just won’t be tab normalized if it’s an object.
  • Fast: Ridiculously faster than BAAMX.

BAAM - Read-only edition:

#!/usr/bin/env bash
Baam (){ # BAAM - [B]ASH [A]ssociative [A]rrays in [M]ulti-dimensions
	DbLayer=$1; shift; Blocks=
	for Prop in "$@"; do
		Blocks=$Blocks'	'
		DbLayerNext=${DbLayer#*$Blocks$Prop[^[:alnum:]]}
		DbLayer=${DbLayerNext%%$'\n'$Blocks[^	]*}
	done
	printf '%s' "$DbLayer"
}
BaamGetVal(){
	Val=$( Baam "$@" )
	if [ "${Val:0:1}" = "	" ]; then
		shift
		printf -v Blocks "%$#s"
		Val=${Val//$'\n'${Blocks// /	}/$'\n'}
		printf '%s' "${Val:$#}"
	else
		printf '%s' "$Val"
	fi
}
BaamLsProps(){
	printf -v Blocks "%$#s"
	Re='^'${Blocks// /	}'([^	][^:]*)'
	while IFS= read -r Line; do
		[[ $Line =~ $Re ]] && printf '%s ' "${BASH_REMATCH[1]}"
	done <<< "$( Baam "$@" )"
}

Example dataset:

Db=`cat <<\EOF
	SpaceX
		Name:SpaceX
		Links
			website:https://www.spacex.com/
			twitter:https://twitter.com/SpaceX
			elon_twitter:https://twitter.com/elonmusk
		Employees:8000
EOF`

Usage Showcase:

printf '%s\n' '1. Getting a simple value from BaamGetVal or Baam:'
printf '%s\n' "$(BaamGetVal "$Db" SpaceX Links website)"
printf '%s\n' "$(Baam "$Db" SpaceX Links website)"

printf '\n%s\n' '2. Getting a tab normalized object from BaamGetVal:'
printf '%s\n' "$(BaamGetVal "$Db" SpaceX Links)"

printf '\n%s\n' '3. Using a tab normalized object as its own database:'
NewDb=$(BaamGetVal "$Db" SpaceX Links)
printf '%s\n' "$(BaamGetVal "$NewDb" website)"

printf '\n%s\n' '4. Getting the properties of an object and counting how many properties each contain:'
for Prop in $(BaamLsProps "$Db" SpaceX); do
	printf '%s\n' "$Prop ($(BaamLsProps "$Db" SpaceX $Prop | wc -w))"
done

Output:

1. Getting a simple value from BaamGetVal or Baam:
https://www.spacex.com/
https://www.spacex.com/

2. Getting a tab normalized object from BaamGetVal:
	website:https://www.spacex.com/
	twitter:https://twitter.com/SpaceX
	elon_twitter:https://twitter.com/elonmusk

3. Using a tab normalized object as its own database:
https://www.spacex.com/

4. Getting the properties of an object and counting how many properties each contain:
Name (0)
Links (3)
Employees (0)

Moving on from here

This hits the line between reading and writing so the next edition will have writing functions.

I may need to rewrite the database parser so it’s a RegEx generator similar to BAAMX but for a BASH test or I might go with character counting. I’ll need to play around a bit for the most elegant approach.

Progress 17: Promises from the lost city of BASH

Development went underground since last update. BAAM’s been implemented into several of my personal scripts including a calendar app I use daily and it’s been expanded to include BAAML arrays, useful errors, programmatic editing that doesn’t pave over blank lines or #comments… and a whole lot more. But…

Why is BAAM still purely an experiment and why you should skip this section

BAAM can be used as a complete multi-dimensional solution but all of the data is in one human readable string including #comments and blank lines. It’s optimized for {read,writ}ing human config files as fast as possible using BASH but it’s several orders of magnitude slower for continual lookups in a running script compared to storing data in individual variables. You can “get away with it” for light lookups but doing real work quickly gets into seconds of runtime.

The intermediate answer I created is baamfl (BAAM Flat Language), a script that flattens BAAML into a format that BASH can quickly interpret into variables and back into BAAML. Which is great and all… but now we’re back to flatness the moment we leave the config file. Variable names “pretending” to be multi-dimensional work for simple lookups and edits but the overhead stacks up extremely quickly the moment you start making real use of the multi-dimensionality. I needed a real answer.

Finding the lost city of BASH

When I started this project I didn’t know much BASH and it’s been a fascinating journey since then, learning as much as I can about its dev culture, its place among languages, why and when you should use it and how deep the language really goes. It’s been like discovering a forgotten city resplendent in ambition that I thought was just a shack in the woods barely a step above Bourne Shell. It’s made all the difference in finding a solution…

…a solution with no strings attached.

I’ve been running experiments and BASH can have true variable based multi-dimensional data structures that are several orders of magnitude faster than reading BAAML and drastically faster and more flexible than working with flattened multi-dimensional variable names… while also being easy to use, save, load and convert to or from any markup language including BAAML. These are big promises but if you could see the smile on my face…

BAAM 2.0 will consist of 3 parts:

  • Simple API for manipulating/creating true multi-dimensional datasets.
  • Saving datasets to native BASH, loading is just source ./my_dataset
  • Read and write to and from human readable multidimensional markup.