Class CMLUtils
java.lang.Object
org.episteme.natural.chemistry.loaders.cml.util.CMLUtils
A number of miscellaneous tools. Originally devised for jumbo.sgml, now rewritten for jumbo.xml.
Use these at your peril: some will be phased out.
- Since:
- 1.0
- Author:
- Silvere Martin-Michiellot, Gemini AI (Google DeepMind)
-
Field Summary
Fields
Modifier and Type, Field: Description
static final int BR_LOOKAHEAD: lookahead for bufferedReader
static final int CASE: case sensitivity flags - used throughout jumbo.xml
static final int IGNORECASE: Description of the Field
static String[] ROMAN_NUMERALS: a list of the first few Roman numerals (for example for chapters)
static final int UNKNOWN: general code for unset or unknown variables
static final String (all remaining fields listed under Field Details): Description of the Field
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method: Description
static void addEnumerationToVector(Vector<Object> v, Enumeration<?> e): add the elements of an Enumeration to a Vector.
addToClasspath(String extraPath): adds to the classpath and resets the system property
static void addToSystemProperties(String urlString): load a file/url into the system properties
alternativeStringTokenizer(String s, char delim): tokenize the string including adjacent delimiters (for example "foo$$bar$", "$" would contain the tokens "foo", "", "bar" and "")
static Hashtable<Object,Object> and(Hashtable<Object, Object> h1, Hashtable<Object, Object> h2): create Hashtable with elements common to h1 and h2.
and: create Vector with elements common to v1 and v2.
static Hashtable<Object,Object> andTables(Hashtable<Object, Object> a, Hashtable<Object, Object> b): AND 2 Hashtables - inefficient except for small tables.
andVectors(Vector<?> a, Vector<?> b): finds elements common to 2 vectors.
static void bug: Description of the Method
static void bug: record that we have hit a program bug!!!
static String capitalise(String s): capitalise a String (whatever the starting case)
convertFormat(Vector<String> vector, String format): converts character format within a Vector of Strings.
static void copyFile: copy one file to another (I suspect there is a better way)
static double cos: Description of the Method
static String createCommaSeparatedStrings: Description of the Method
static File createNewFile(String fileName): create new file, including making directory if required. This seems to be a mess - f.createNewFile() doesn't seem to work. A directory should have a trailing file.separator
static boolean deleteFile(File file, boolean deleteDirectory): delete a file. If deleteDirectory==true then file will be recursively deleted
static String deQuote: remove balanced quotes from ends of (trimmed) string, else no action
static String dump: reads a stream from url and outputs it as integer values of the characters and as strings.
static boolean equals: compares two objects using equals(); allows for null objects. If either object is null, returns false
static boolean equals: convenience function for comparing strings using CMLUtils.CASE/IGNORECASE
static void error: Error message - nothing fancy at present.
static String escape: default escape characters in an XML string (' -> &apos;, etc); also escape non-XML characters (for example eacute => é)
static String escape: escape characters in an XML string; also escape non-XML characters (for example eacute => é).
static void flush: output String and flush()
static void freeMemory(long mem): runs the garbage collector if memory drops below mem.
getCommaSeparatedStrings: parse comma-separated Strings. Note fields can be "" (as in ,,,) and fields can be quoted "...".
static FileOutputStream getFileOutputStream(String fileName): get an OutputStream from a file or URL.
static int getIntegerFromRoman(String roman): translate Roman numerals up to 50. Some normalisation is performed. Failure returns -1
static int getIntFromHex(String hex): Translates a Hex number to its int equivalent.
static Object getNewInstance(String className): gets a new instance of a class from a hashtable because normal methods are very slow
static String getPWDName: get current directory
getRepeatedValues(Vector<?> v): returns a vector of all repeated values in v.
static String getSuffix: gets suffix from filename
static int indexOf: get the index of a String in an array
static int indexOfBalancedBracket(char lbrack, String s): return index of balanced bracket, -1 for none.
invert: invert a Hashtable by interchanging keys and values.
static boolean isAllowedFormat(String format): Gets the allowedFormat attribute of the CMLUtils class
static boolean isRightMouseClick(MouseEvent event): a crude way of identifying a right mouse click (because I left the Java book behind)
static String leftTrim: remove leading blanks
static double log: Description of the Method
static void main: Description of the Method
static String makeAbsoluteURL(String url): If a URL is relative, make it absolute against the current directory.
static String makeDirectory(String urlString): truncate filename suffix to make a directory name (without file.separator)
static void message: message - nothing fancy at present
static String normaliseWhitespace: normalise whitespace in a String (all whitespace is transformed to single spaces and the string is NOT trimmed)
not: create Hashtable with elements in h1 but not h2.
not: create Vector with elements in v1 but not v2.
or: create Hashtable with elements in h1 but not h2.
or: create Vector with elements in v1 but not v2.
static Hashtable<Object,Object> orTables(Hashtable<Object, Object> a, Hashtable<Object, Object> b): OR 2 Hashtables - inefficient except for small tables.
static String outputFloat(int nPlaces, int nDec, double value): format, for example f8.3. This is a mess; if the value cannot fit, then it either right-truncates or, when that doesn't work, returns ****
static String outputInteger(int nPlaces, int value): this is a mess
static String outputNumber(int nPlaces, int nDec, double c): as above, but trims trailing zeros
parseWhitespaceQuotedFields: parse whitespace-separated tokens interspersed with quoted strings, for example
this is "a quoted string" and 'another token' as well
parses to:
this/is/a quoted string/and/another token/as well
static void printChar(): Description of the Method
static String quoteConcatenate(String[] s): concatenate strings into quote-separated string
static byte[] readByteArray: reads a byte array from DataInputStream, *including* line feeds
static byte[] readByteArray(String filename): reads a byte array from file, *including* line feeds
static void readZip: read a Zipfile
static String removeHTML(String s): remove balanced (well-formed) markup from a string.
static String rightTrim: remove trailing blanks
static void setSystemProperty(String property, String value): add a property to the System ones. Don't know if this is a good idea...
static double sin: Description of the Method
static int skipWhite(BufferedReader bReader): skip white lines and end with first non-white line. Leaves bReader ready to read first non-white line
static void sort: sort an object array - very inefficient
static void sortVector(Vector<Object> v): sort a Vector - VERY crude and inefficient
static String spaces(int nspace): make a String of a given number of spaces
static String[] split: splits a whitespace-separated set of tokens into a String[]
static String stripISOControls: remove all control (non-printing) characters
static byte[] stripNewlines(byte[] b): strip linefeeds from a byte array
static String substituteDOSbyAscii: substitute certain DOS-compatible diacriticals by the Unicode value.
static String substituteEquals: substitute hex representation of character, for example =2E by char(46).
static String substituteString(String s, String oldSubstring, String newSubstring, int count): make substitutions in a string.
static String substituteStrings(String s, String[] oldSubstrings, String[] newSubstrings): make substitutions in a string.
static String substring: supports XSL substring
static String toCamelCase: Description of the Method
static String truncate: return the first n characters of a string and add ellipses if truncated
static void warning: Warning message - nothing fancy at present
xorTables: XOR 2 Hashtables inefficient except for small tables. omit
-
Field Details
-
FORMAT_ASCII
-
FORMAT_DOS
-
FORMAT_EQUALS
-
SPACE
-
TAB
-
RETURN
-
NEWLINE
-
FORMFEED
-
WHITESPACE
-
LBRAK
-
RBRAK
-
SHRIEK
-
QUOT
-
POUND
-
DOLLAR
-
PERCENT
-
CARET
-
AMP
-
STAR
-
UNDER
-
MINUS
-
PLUS
-
EQUALS
-
LCURLY
-
RCURLY
-
LSQUARE
-
RSQUARE
-
TILDE
-
HASH
-
COLON
-
SEMICOLON
-
ATSIGN
-
APOS
-
COMMA
-
PERIOD
-
SLASH
-
QUERY
-
LANGLE
-
RANGLE
-
PIPE
-
BACKSLASH
-
NONWHITEPUNC
-
PUNC
-
X_STAGO
-
X_STAGC
-
X_ETAGO
-
X_ETAGC
-
X_EMPTAGO
-
X_EMPTAGC
-
X_STARTDEF
-
X_COMMENTO
-
X_COMMENTC
-
X_ENTDEFO
-
X_ENTDEFC
-
X_PARAMENTDEFO
-
X_PARAMENTDEFC
-
X_ELEMDEFO
-
X_ELEMDEFC
-
X_PARAMENTO
-
X_PARAMENTC
-
X_GENENTO
-
X_GENENTC
-
X_DOCTYPEO
-
X_DOCTYPEE
-
UNKNOWN
public static final int UNKNOWN
general code for unset or unknown variables
-
CASE
public static final int CASE
case sensitivity flags - used throughout jumbo.xml
-
IGNORECASE
public static final int IGNORECASE
Description of the Field
-
BR_LOOKAHEAD
public static final int BR_LOOKAHEAD
lookahead for bufferedReader
-
ROMAN_NUMERALS
a list of the first few Roman numerals (for example for chapters) -
DOS
-
-
Constructor Details
-
CMLUtils
public CMLUtils()
-
-
Method Details
-
deleteFile
delete a file. If deleteDirectory==true then file will be recursively deleted.
- Parameters:
- file - Description of the Parameter
- deleteDirectory - Description of the Parameter
- Returns:
- Description of the Return Value
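The recursive behaviour described above can be sketched as follows. This is an illustration only, not the CMLUtils implementation: the class name `DeleteFileSketch` and method name `deleteRecursively` are hypothetical, and the sketch assumes the flag decides whether a directory's contents are removed before the directory itself.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

final class DeleteFileSketch {
    /**
     * Deletes a file. If it is a directory and deleteDirectory is true,
     * the directory's contents are deleted first.
     * Returns the result of the final File.delete() call.
     */
    static boolean deleteRecursively(File file, boolean deleteDirectory) {
        if (file.isDirectory() && deleteDirectory) {
            File[] children = file.listFiles();
            if (children != null) { // listFiles() returns null on an I/O error
                for (File child : children) {
                    deleteRecursively(child, true);
                }
            }
        }
        // File.delete() returns false for a directory that is not empty
        return file.delete();
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("sketch").toFile();
        new File(dir, "child.txt").createNewFile();
        System.out.println(deleteRecursively(dir, false)); // directory still has a child
        System.out.println(deleteRecursively(dir, true));
    }
}
```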
-
copyFile
copy one file to another (I suspect there is a better way)
- Parameters:
- inFile - Description of the Parameter
- outFile - Description of the Parameter
- Throws:
- FileNotFoundException - Description of the Exception
- IOException - Description of the Exception
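On Java 7 and later, `java.nio.file.Files.copy` from the standard library is one candidate for the "better way" mentioned above. The sketch below is an assumption about how a caller might replace copyFile, not a description of what CMLUtils does; the class name `CopyFileSketch` and the String parameter types are hypothetical. Note that `Files.copy` reports a missing source with `NoSuchFileException` (a subclass of `IOException`) rather than `FileNotFoundException`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

final class CopyFileSketch {
    /** Copies inFile to outFile, overwriting outFile if it already exists. */
    static void copyFile(String inFile, String outFile) throws IOException {
        Path source = Paths.get(inFile);
        Path target = Paths.get(outFile);
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path in = Files.createTempFile("in", ".txt");
        Path out = Files.createTempFile("out", ".txt");
        Files.write(in, "hello".getBytes(StandardCharsets.UTF_8));
        copyFile(in.toString(), out.toString());
        System.out.println(new String(Files.readAllBytes(out), StandardCharsets.UTF_8));
    }
}
```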
-
dump
reads a stream from url and outputs it as integer values of the characters and as strings. Emulates UNIX od().
- Parameters:
- url - Description of the Parameter
- Returns:
- String tabular version of input (in 10-column chunks)
- Throws:
- Exception - Description of the Exception
-
flush
output String and flush()
- Parameters:
- s - Description of the Parameter
-
spaces
make a String of a given number of spaces
- Parameters:
- nspace - Description of the Parameter
- Returns:
- Description of the Return Value
-
getSuffix
-
skipWhite
skip white lines and end with first non-white line. Leaves bReader ready to read first non-white line.
- Parameters:
- bReader - Description of the Parameter
- Returns:
- int number of lines skipped
- Throws:
- Exception - Description of the Exception
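One way to leave a reader positioned at the first non-blank line is `BufferedReader.mark` and `reset`. The sketch below assumes that approach (the BR_LOOKAHEAD field hints at a read-ahead limit, but whether CMLUtils works this way is an assumption). The class name `SkipWhiteSketch` and the `LOOKAHEAD` value are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

final class SkipWhiteSketch {
    // Hypothetical read-ahead limit in characters; a line longer than this
    // could make reset() fail.
    static final int LOOKAHEAD = 8192;

    /** Skips blank lines and returns how many were skipped. */
    static int skipWhite(BufferedReader reader) throws IOException {
        int skipped = 0;
        while (true) {
            reader.mark(LOOKAHEAD);
            String line = reader.readLine();
            if (line == null) {
                return skipped;   // end of input
            }
            if (!line.trim().isEmpty()) {
                reader.reset();   // un-read the non-blank line
                return skipped;
            }
            skipped++;
        }
    }

    public static void main(String[] args) throws IOException {
        BufferedReader r = new BufferedReader(new StringReader("\n  \nfirst\nsecond\n"));
        System.out.println(skipWhite(r));
        System.out.println(r.readLine());
    }
}
```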
-
truncate
-
getIntegerFromRoman
translate Roman numerals up to 50. Some normalisation is performed. Failure returns -1.
- Parameters:
- roman - Description of the Parameter
- Returns:
- The integerFromRoman value
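A minimal sketch of the documented contract (values up to 50, -1 on failure). It assumes that "normalisation" means trimming and ignoring case, which the documentation does not confirm, and it only accepts canonical numerals. The class and method names are hypothetical.

```java
import java.util.Locale;

final class RomanSketch {
    private static final int[] VALUES = {50, 40, 10, 9, 5, 4, 1};
    private static final String[] SYMBOLS = {"L", "XL", "X", "IX", "V", "IV", "I"};

    /** Builds the canonical numeral for n (intended for 1..50). */
    static String toRoman(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < VALUES.length; i++) {
            while (n >= VALUES[i]) {
                sb.append(SYMBOLS[i]);
                n -= VALUES[i];
            }
        }
        return sb.toString();
    }

    /** Returns 1..50 for a canonical numeral, or -1 for anything else. */
    static int fromRoman(String roman) {
        if (roman == null) {
            return -1;
        }
        // Assumed normalisation: trim whitespace and fold to upper case
        String normalised = roman.trim().toUpperCase(Locale.ROOT);
        for (int n = 1; n <= 50; n++) {
            if (toRoman(n).equals(normalised)) {
                return n;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(fromRoman(" xiv "));
        System.out.println(fromRoman("IIII"));
    }
}
```

Matching against generated canonical numerals keeps the sketch short and rejects malformed input such as `IIII` without a separate validation pass.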
-
setSystemProperty
-
addToSystemProperties
load a file/url into the system properties
- Parameters:
- urlString - The feature to be added to the ToSystemProperties attribute
- Throws:
- IOException - Description of the Exception
-
getNewInstance
-
deQuote
-
rightTrim
-
leftTrim
-
indexOfBalancedBracket
return index of balanced bracket, -1 for none. String MUST start with '('
- Parameters:
- lbrack - Description of the Parameter
- s - Description of the Parameter
- Returns:
- Description of the Return Value
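The search for a balancing bracket can be sketched with a depth counter. This is an illustration under assumptions: the mapping from opening to closing bracket and the requirement that the string starts with `lbrack` are guesses at the intended behaviour, and `BracketSketch` is a hypothetical name.

```java
final class BracketSketch {
    /**
     * Returns the index of the bracket that balances the one at position 0,
     * or -1 if s does not start with lbrack or is unbalanced.
     */
    static int indexOfBalancedBracket(char lbrack, String s) {
        String open = "([{<";
        String close = ")]}>";
        int kind = open.indexOf(lbrack);
        if (kind < 0 || s == null || s.isEmpty() || s.charAt(0) != lbrack) {
            return -1;
        }
        char rbrack = close.charAt(kind);
        int depth = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == lbrack) {
                depth++;
            } else if (c == rbrack && --depth == 0) {
                return i; // depth back to zero: this closes the first bracket
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfBalancedBracket('(', "(a(b)c)d"));
    }
}
```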
-
getCommaSeparatedStrings
parse comma-separated Strings. Note fields can be "" (as in ,,,) and fields can be quoted "...". If so, embedded quotes are represented as "", for example A," this is a ""B"" character",C. An unbalanced quote returns a mess.
- Parameters:
- s - Description of the Parameter
- Returns:
- Vector the vector of Strings - any error returns null
- Throws:
- Exception - Description of the Exception
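The quoting rules above (empty fields, quoted fields, `""` as an embedded quote) can be sketched with a small state machine. This is not the CMLUtils code: `CsvSketch` is a hypothetical name, it returns a `List` rather than a `Vector`, and returning null for an unbalanced quote is an assumption based on the "any error returns null" note.

```java
import java.util.ArrayList;
import java.util.List;

final class CsvSketch {
    /** Splits s on commas, honouring double-quoted fields. */
    static List<String> parse(String s) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < s.length() && s.charAt(i + 1) == '"') {
                        field.append('"'); // "" inside quotes is a literal quote
                        i++;
                    } else {
                        inQuotes = false;  // closing quote
                    }
                } else {
                    field.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;           // opening quote
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        if (inQuotes) {
            return null;                   // unbalanced quote (assumed error case)
        }
        fields.add(field.toString());
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parse("A,\" this is a \"\"B\"\" character\",C"));
        System.out.println(parse(",,,").size());
    }
}
```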
-
createCommaSeparatedStrings
-
alternativeStringTokenizer
tokenize the string including adjacent delimiters (for example "foo$$bar$", "$" would contain the tokens "foo", "", "bar" and "")
- Parameters:
- s - Description of the Parameter
- delim - Description of the Parameter
- Returns:
- Description of the Return Value
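The difference from `java.util.StringTokenizer`, which skips empty tokens between adjacent delimiters, can be sketched as below. The class name `TokenizerSketch` and the `List` return type are assumptions for illustration, not the CMLUtils signature.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

final class TokenizerSketch {
    /** Splits s at every delim, keeping empty tokens. */
    static List<String> tokenize(String s, char delim) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == delim) {
                tokens.add(s.substring(start, i));
                start = i + 1;
            }
        }
        tokens.add(s.substring(start)); // text after the last delimiter, possibly empty
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("foo$$bar$", '$'));
        // The standard tokenizer drops the empty tokens
        System.out.println(new StringTokenizer("foo$$bar$", "$").countTokens());
    }
}
```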
-
parseWhitespaceQuotedFields
parse whitespace-separated tokens interspersed with quoted strings, for example
this is "a quoted string" and 'another token' as well
parses to:
this/is/a quoted string/and/another token/as well
- Parameters:
- s - Description of the Parameter
- Returns:
- Vector of strings (size = 0 if s is whitespace)
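A regular expression is one way to sketch this kind of tokenising. The code below is an assumption about the behaviour, not the CMLUtils implementation, and `QuotedFieldsSketch` is a hypothetical name. Under a strictly whitespace-separated reading, the trailing `as well` yields two tokens, which is what this sketch produces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuotedFieldsSketch {
    // A double-quoted run, a single-quoted run, or a run of non-whitespace
    private static final Pattern TOKEN =
            Pattern.compile("\"([^\"]*)\"|'([^']*)'|(\\S+)");

    /** Returns the tokens of s with surrounding quotes removed. */
    static List<String> parse(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        while (m.find()) {
            if (m.group(1) != null) {
                out.add(m.group(1));
            } else if (m.group(2) != null) {
                out.add(m.group(2));
            } else {
                out.add(m.group(3));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse("this is \"a quoted string\" and 'another token' as well"));
    }
}
```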
-
quoteConcatenate
-
split
-
indexOf
-
equals
convenience function for comparing strings using CMLUtils.CASE/IGNORECASE
- Parameters:
- string1 - Description of the Parameter
- string2 - Description of the Parameter
- sensitivity - Description of the Parameter
- Returns:
- boolean true if IGNORECASE and string1.equalsIgnoreCase(string2) or string1.equals(string2)
-
removeHTML
remove balanced (well-formed) markup from a string. Crude (that is not fully XML-compliant). Example: "This is <A HREF="foo">bar</A> and </BR> a break" goes to "This is bar and a break"
- Parameters:
- s - Description of the Parameter
- Returns:
- Description of the Return Value
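A crude tag stripper in the same spirit can be sketched with a regular expression. This is an assumption, not the CMLUtils code, and `RemoveMarkupSketch` is a hypothetical name. Plain tag removal leaves the spaces on both sides of a removed tag, so the sketch needs a whitespace-collapsing step to reproduce the single-spaced output shown in the example.

```java
final class RemoveMarkupSketch {
    /** Removes anything that looks like a tag: '<', non-'>' characters, '>'. */
    static String removeMarkup(String s) {
        return s.replaceAll("<[^>]*>", "");
    }

    /** Collapses each run of whitespace to a single space. */
    static String collapseWhitespace(String s) {
        return s.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String in = "This is <A HREF=\"foo\">bar</A> and </BR> a break";
        System.out.println(collapseWhitespace(removeMarkup(in)));
    }
}
```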
-
warning
Warning message - nothing fancy at present
- Parameters:
- s - Description of the Parameter
-
message
message - nothing fancy at present
- Parameters:
- s - Description of the Parameter
-
error
Error message - nothing fancy at present. Display in Text frame
- Parameters:
- s - Description of the Parameter
-
bug
record that we have hit a program bug!!!
- Parameters:
- s - Description of the Parameter
-
bug
Description of the Method
- Parameters:
- e - Description of the Parameter
-
createNewFile
create new file, including making directory if required. This seems to be a mess - f.createNewFile() doesn't seem to work. A directory should have a trailing file.separator
- Parameters:
- fileName - Description of the Parameter
- Returns:
- Description of the Return Value
- Throws:
- IOException - Description of the Exception
-
getPWDName
-
substituteString
public static String substituteString(String s, String oldSubstring, String newSubstring, int count)
make substitutions in a string. If oldSubstring = "A" and newSubstring = "aa" then count occurrences of "A" in s are replaced with "aa", etc. "AAA" with count=2 would be replaced by "aaaaA"
- Parameters:
- s - Description of the Parameter
- oldSubstring - Description of the Parameter
- newSubstring - Description of the Parameter
- count - Description of the Parameter
- Returns:
- Description of the Return Value
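The count-limited replacement in the example can be sketched as a left-to-right scan. This illustrates the documented example only; `SubstituteSketch` is a hypothetical name and the handling of an empty search string is an assumption.

```java
final class SubstituteSketch {
    /** Replaces at most count occurrences of oldSub in s, scanning left to right. */
    static String substitute(String s, String oldSub, String newSub, int count) {
        if (oldSub.isEmpty()) {
            return s; // assumed: nothing to replace for an empty search string
        }
        StringBuilder out = new StringBuilder();
        int from = 0;
        int done = 0;
        while (done < count) {
            int idx = s.indexOf(oldSub, from);
            if (idx < 0) {
                break;
            }
            out.append(s, from, idx).append(newSub);
            from = idx + oldSub.length();
            done++;
        }
        out.append(s.substring(from)); // remainder is copied unchanged
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(substitute("AAA", "A", "aa", 2)); // prints aaaaA
    }
}
```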
-
substituteStrings
make substitutions in a string. If oldSubstrings = {"A", "BB", "C"} and newSubstrings = {"aa", "b", "zz"} then every occurrence of "A" in s is replaced with "aa", etc. "BBB" would be replaced by "bB"
- Parameters:
- s - Description of the Parameter
- oldSubstrings - Description of the Parameter
- newSubstrings - Description of the Parameter
- Returns:
- Description of the Return Value
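A minimal sketch of the paired-array substitution, assuming the pairs are applied one after another in array order (the documentation does not say). `SubstituteAllSketch` is a hypothetical name. One consequence of sequential passes is that a later pair can rewrite text produced by an earlier one.

```java
final class SubstituteAllSketch {
    /** Applies each (olds[i] -> news[i]) replacement in turn. */
    static String substituteAll(String s, String[] olds, String[] news) {
        if (olds.length != news.length) {
            throw new IllegalArgumentException("array lengths differ");
        }
        String result = s;
        for (int i = 0; i < olds.length; i++) {
            result = result.replace(olds[i], news[i]); // literal, non-regex replace
        }
        return result;
    }

    public static void main(String[] args) {
        String[] olds = {"A", "BB", "C"};
        String[] news = {"aa", "b", "zz"};
        System.out.println(substituteAll("BBB", olds, news)); // prints bB
    }
}
```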
-
substituteDOSbyAscii
-
substituteEquals
-
isAllowedFormat
Gets the allowedFormat attribute of the CMLUtils class
- Parameters:
- format - Description of the Parameter
- Returns:
- The allowedFormat value
-
convertFormat
converts character format within a Vector of Strings. Some formats such as '=' escaping may require lines to be joined. Original Vector is unaltered.
- Parameters:
- vector - Description of the Parameter
- format - Description of the Parameter
- Returns:
- Description of the Return Value
-
capitalise
-
toCamelCase
-
escape
escape characters in an XML string; also escape non-XML characters (for example eacute => é). If escapes==null only escape non-XML
- Parameters:
- s - Description of the Parameter
- escapes - Description of the Parameter
- escape1 - Description of the Parameter
- Returns:
- Description of the Return Value
-
escape
-
equals
-
freeMemory
public static void freeMemory(long mem)
runs the garbage collector if memory drops below mem. (I use a value of 300000 - your mileage may vary). Potentially used in loops for processing input and creation of objects
- Parameters:
- mem - Description of the Parameter
-
addToClasspath
-
getIntFromHex
Translates a Hex number to its int equivalent. Thus "FE" translates to 254. Horrid, but I couldn't find if Java reads hex. All results are >= 0. Errors return -1
- Parameters:
- hex - Description of the Parameter
- Returns:
- The intFromHex value
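Java does read hex: `Integer.parseInt(String, int)` in the standard library accepts radix 16. The sketch below wraps it to match the documented contract (non-negative results, -1 on error). `HexSketch` is a hypothetical name, and trimming the input is an assumption.

```java
final class HexSketch {
    /** Parses a hex string such as "FE"; returns -1 on any error. */
    static int intFromHex(String hex) {
        if (hex == null) {
            return -1;
        }
        try {
            int value = Integer.parseInt(hex.trim(), 16);
            return value >= 0 ? value : -1; // the documented contract has no negative results
        } catch (NumberFormatException e) {
            return -1;                      // not hex, empty, or out of int range
        }
    }

    public static void main(String[] args) {
        System.out.println(intFromHex("FE")); // prints 254
    }
}
```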
-
readByteArray
reads a byte array from file, *including* line feeds
- Parameters:
- filename - Description of the Parameter
- Returns:
- Description of the Return Value
- Throws:
- FileNotFoundException - Description of the Exception
- IOException - Description of the Exception
-
readByteArray
reads a byte array from DataInputStream, *including* line feeds
- Parameters:
- d - Description of the Parameter
- Returns:
- Description of the Return Value
- Throws:
- IOException - Description of the Exception
-
stripISOControls
-
normaliseWhitespace
-
stripNewlines
public static byte[] stripNewlines(byte[] b)
strip linefeeds from a byte array
- Parameters:
- b - Description of the Parameter
- Returns:
- Description of the Return Value
-
isRightMouseClick
a crude way of identifying a right mouse click (because I left the Java book behind)
- Parameters:
- event - Description of the Parameter
- Returns:
- The rightMouseClick value
-
makeDirectory
-
makeAbsoluteURL
If a URL is relative, make it absolute against the current directory. If url already has a protocol, return unchanged
- Parameters:
- url - Description of the Parameter
- Returns:
- Description of the Return Value
- Throws:
- MalformedURLException - Description of the Exception
-
getFileOutputStream
get an OutputStream from a file or URL. Required (I think) because strings of the sort "file:/C:\foo\bat.txt" crash FileOutputStream, so this strips off the file:/ prefix for Windows-style paths
- Parameters:
- fileName - Description of the Parameter
- Returns:
- FileOutputStream a new (opened) FileOutputStream
- Throws:
- FileNotFoundException - Description of the Exception
-
readZip
read a Zipfile
- Parameters:
- fileName - Description of the Parameter
- Throws:
- IOException - Description of the Exception
-
outputInteger
this is a mess
- Parameters:
- nPlaces - Description of the Parameter
- value - Description of the Parameter
- Returns:
- Description of the Return Value
- Throws:
- IllegalArgumentException - Description of the Exception
-
outputFloat
public static String outputFloat(int nPlaces, int nDec, double value) throws IllegalArgumentException
format, for example f8.3. This is a mess; if the value cannot fit, then it either right-truncates or, when that doesn't work, returns ****
- Parameters:
- nPlaces - Description of the Parameter
- nDec - Description of the Parameter
- value - Description of the Parameter
- Returns:
- Description of the Return Value
- Throws:
- IllegalArgumentException - Description of the Exception
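Fortran-style `f8.3` output can be sketched with `String.format`. The code below is an interpretation, not the CMLUtils algorithm: it assumes "right-truncates" means dropping decimal places until the value fits (and `String.format` rounds rather than truncates), then falls back to asterisks. `FixedWidthSketch` is a hypothetical name.

```java
import java.util.Locale;

final class FixedWidthSketch {
    /** Formats value right-aligned in nPlaces columns with up to nDec decimals. */
    static String outputFloat(int nPlaces, int nDec, double value) {
        if (nPlaces <= 0 || nDec < 0) {
            throw new IllegalArgumentException("nPlaces must be > 0 and nDec >= 0");
        }
        // Try the requested precision first, then progressively fewer decimals
        for (int d = nDec; d >= 0; d--) {
            String s = String.format(Locale.ROOT, "%" + nPlaces + "." + d + "f", value);
            if (s.length() <= nPlaces) {
                return s;
            }
        }
        // Nothing fits: fill the field with asterisks
        StringBuilder stars = new StringBuilder();
        for (int i = 0; i < nPlaces; i++) {
            stars.append('*');
        }
        return stars.toString();
    }

    public static void main(String[] args) {
        System.out.println("[" + outputFloat(8, 3, 3.14159) + "]");
        System.out.println("[" + outputFloat(4, 1, 123456.0) + "]");
    }
}
```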
-
outputNumber
as outputFloat, but trims trailing zeros
- Parameters:
- nPlaces - Description of the Parameter
- nDec - Description of the Parameter
- c - Description of the Parameter
- Returns:
- Description of the Return Value
-
invert
-
andTables
public static Hashtable<Object,Object> andTables(Hashtable<Object, Object> a, Hashtable<Object, Object> b)
AND 2 Hashtables - inefficient except for small tables. Finds entries with the same key and value.
- Parameters:
- a - Description of the Parameter
- b - Description of the Parameter
- Returns:
- Hashtable contains only common entries. null if none
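The intersection described above (same key and same value, null when nothing is shared) can be sketched as follows. `TableAndSketch` is a hypothetical name and this is an illustration of the documented contract, not the CMLUtils code.

```java
import java.util.Hashtable;
import java.util.Map;

final class TableAndSketch {
    /** Returns entries present in both tables with equal values, or null if none. */
    static Hashtable<Object, Object> and(Hashtable<Object, Object> a,
                                         Hashtable<Object, Object> b) {
        Hashtable<Object, Object> common = new Hashtable<>();
        for (Map.Entry<Object, Object> e : a.entrySet()) {
            Object other = b.get(e.getKey());
            // Hashtable never stores null values, so getValue() is non-null here
            if (e.getValue().equals(other)) {
                common.put(e.getKey(), e.getValue());
            }
        }
        return common.isEmpty() ? null : common;
    }

    public static void main(String[] args) {
        Hashtable<Object, Object> a = new Hashtable<>();
        Hashtable<Object, Object> b = new Hashtable<>();
        a.put("x", 1);
        a.put("y", 2);
        b.put("x", 1);
        b.put("y", 3);
        System.out.println(and(a, b)); // only x=1 is shared
    }
}
```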
-
orTables
public static Hashtable<Object,Object> orTables(Hashtable<Object, Object> a, Hashtable<Object, Object> b)
OR 2 Hashtables - inefficient except for small tables. Merges entries. If an entry with the same key and a different value is found, the value is taken from the first table.
- Parameters:
- a - Description of the Parameter
- b - Description of the Parameter
- Returns:
- Hashtable contains all entries. null if none
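A minimal sketch of the merge rule above, where the first table wins on a key clash and null is returned when both tables are empty. `TableOrSketch` is a hypothetical name and the code is an illustration, not the CMLUtils implementation.

```java
import java.util.Hashtable;

final class TableOrSketch {
    /** Returns the union of a and b; on a key clash the value from a is kept. */
    static Hashtable<Object, Object> or(Hashtable<Object, Object> a,
                                        Hashtable<Object, Object> b) {
        Hashtable<Object, Object> merged = new Hashtable<>(b);
        merged.putAll(a); // entries from a overwrite clashing keys copied from b
        return merged.isEmpty() ? null : merged;
    }

    public static void main(String[] args) {
        Hashtable<Object, Object> a = new Hashtable<>();
        Hashtable<Object, Object> b = new Hashtable<>();
        a.put("x", 1);
        b.put("x", 9);
        b.put("z", 3);
        System.out.println(or(a, b).get("x")); // prints 1
    }
}
```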
-
xorTables
-
getRepeatedValues
-
andVectors
-
addEnumerationToVector
add the elements of an Enumeration to a Vector.
- Parameters:
- v - The feature to be added to the EnumerationToVector attribute
- e - The feature to be added to the EnumerationToVector attribute
-
sort
sort an object array - very inefficient
- Parameters:
- objs - Description of the Parameter
-
printChar
public static void printChar()
Description of the Method
-
sortVector
-
and
-
not
-
or
-
and
public static Hashtable<Object,Object> and(Hashtable<Object, Object> h1, Hashtable<Object, Object> h2)
create Hashtable with elements common to h1 and h2. The keys are taken from h1. SLOW. Comparison is done with equals()
- Parameters:
- h1 - Description of the Parameter
- h2 - Description of the Parameter
- Returns:
- Description of the Return Value
-
not
public static Hashtable<Object,Object> not(Hashtable<Object, Object> h1, Hashtable<Object, Object> h2)
create Hashtable with elements in h1 but not h2. The keys are taken from h1. SLOW. Comparison is done with equals()
- Parameters:
- h1 - Description of the Parameter
- h2 - Description of the Parameter
- Returns:
- Description of the Return Value
-
or
create Hashtable with elements in h1 but not h2. The keys are taken from h1. SLOW. Comparison is done with equals()
- Parameters:
- h1 - Description of the Parameter
- h2 - Description of the Parameter
- Returns:
- Description of the Return Value
-
sin
Description of the Method
- Parameters:
- fString - Description of the Parameter
- Returns:
- Description of the Return Value
-
cos
Description of the Method
- Parameters:
- fString - Description of the Parameter
- Returns:
- Description of the Return Value
-
log
Description of the Method
- Parameters:
- fString - Description of the Parameter
- Returns:
- Description of the Return Value
-
substring
-
main
Description of the Method
- Parameters:
- args - Description of the Parameter
-