MSGID: 1:2624/306.0 58c1d8c2
REPLY: 1:379/45 97a24391
PID: timEd/2 1.10.y2k
TID: FastEcho 1.46 8024
Randall Parker wrote in a message to John Beckett:
RP> From: Randall Parker
RP> ide_dot-here_com>
RP> John Beckett wrote:
> You've had some informative replies, but they have not mentioned one vital
> point. If Unix had a 'dir /s' command, the above would still not do what
> you want because Unix has no concept of a file extension.
RP> Suppose one wants to find, say, book*.html. How hard is it to do
RP> that?
Trivially easy. A lot has been left unsaid that needs to be stated here.
First, like DOS and OS/2, Linux and Unix have the concept of the current
directory and the parent directory. All four environments even use the same
convention: . is the current directory and .. is the parent directory. Any
command that takes a path will therefore accept . for the current directory
and .. for the parent, and you can reference further directories from there.
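A small sketch of the . and .. convention, for anyone who wants to try it. It builds a throwaway directory tree first; the names top.txt, sub and inner.txt are made up for the example.

```shell
#!/bin/bash
# Build a throwaway tree so the example does not depend on real files.
# The names "top.txt", "sub" and "inner.txt" are hypothetical.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/top.txt" "$demo/sub/inner.txt"

cd "$demo/sub" || exit 1

# "." is the current directory, so this lists files under sub only.
here=$(find . -type f)

# ".." is the parent directory, so this search starts one level up.
above=$(find .. -type f | sort)

printf 'from .:\n%s\n\nfrom ..:\n%s\n' "$here" "$above"
```

The same two names work with any command that takes a path, for example ls .. or cp ../top.txt .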
The next thing that hasn't been stated is case sensitivity. In Windows and
OS/2, file names are generally not case sensitive. In Linux and Unix they
are: Book.html and book.html are two different files.
In addition, Linux web servers are very tolerant in regards to what form their
file names take. They can be tagged .html, they can also be tagged .htm and
they can even be upper case to allow people to upload files from braindead
systems like Windows to a real web server.
Lastly, nobody has touched on the biggest strength of the Linux and Unix
shells, that of chaining commands together using pipes and redirections.
If I wanted to find all web pages with the word book anywhere in their path,
starting at the current directory, I would use the following command:
find . -type f | grep -Ei 'book.*\.html?$'
If however I wanted only those files that had 'book' at the beginning of the
filename, I would change the above as follows:
find . -type f | grep -Ei '/book[^/]*\.html?$'
Note that the backslash is the RegExp escape character; it lets the search
find an actual period. A forward slash needs no escape. An unescaped period
matches any single character. The * says to repeat the previous item zero or
more times, as many as the rest of the pattern allows. The [^/]* matches any
run of characters other than a slash, which keeps the match inside the
filename. The -E turns on extended regular expressions, where l? makes the l
optional so that both .htm and .html match. The $ says the match has to end at
the end of the line. The -i says to ignore case when searching. The single
quotes stop the shell from touching the pattern before grep sees it.
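A sketch for trying out patterns like these without touching the file system. The list of paths stands in for the output of find, and every name in it is made up. It assumes a grep that supports -E (extended regular expressions), which any POSIX grep does.

```shell
#!/bin/bash
# Sample paths fed to grep on stdin, standing in for the output of find.
# All of these names are hypothetical.
paths='./docs/book1.html
./docs/BOOK2.HTM
./docs/mybook.html
./bookshelf/index.html
./docs/book3.htmx'

# Loose match: "book" anywhere in the path, ending in .htm or .html.
loose=$(printf '%s\n' "$paths" | grep -Ei 'book.*\.html?$')

# Strict match: the filename itself must start with "book".
strict=$(printf '%s\n' "$paths" | grep -Ei '/book[^/]*\.html?$')

printf 'loose:\n%s\n\nstrict:\n%s\n' "$loose" "$strict"
```

The loose pattern keeps the first four paths, including the one where only the directory is named book-something. The strict pattern keeps just book1.html and BOOK2.HTM.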
> The pattern book*.*
> matches only file names starting with 'book' AND
> that contain a period (".") after 'book'.
>
> As was mentioned, you need pattern book* (no period).
RP> So find is the command to use unless you have an up-to-date
RP> database for locate?
> The action "-print" is usually the default, so the equivalent of
> DOS 'dir /s book*.*' is
> find . -name "book*"
RP> This seems verbose. But the -name applies to what comes after it,
RP> right? So how could one create an alias or script that would just
RP> take a path and a string? e.g.:
RP> ff . "book*"
That depends on whether you want to ignore case, and whether you want book at
the beginning of the file name or anywhere in it. To be truly flexible I'd
write it as a script that takes a regular expression as the search term. All
of the previous messages implied you were looking only for files, yet the
find -name examples would also have turned up any directories named book*
below the current directory. The -type f in the find command limits the
search to regular files.
#!/bin/bash
# ff: list files under directory $1 whose path matches regular expression $2
find "$1" -type f | grep -- "$2"
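The same idea as a shell function, as a rough sketch rather than the one true way to do it. The name ff comes from Randall's example; the usage check, the -E and the case-insensitive match are my own assumptions about what would be handy.

```shell
#!/bin/bash
# ff DIR REGEX: list regular files under DIR whose path matches REGEX.
# Extended regexps and case-insensitive matching are choices made for
# this sketch, not something the original one-liner did.
ff() {
    if [ "$#" -ne 2 ]; then
        echo "usage: ff DIR REGEX" >&2
        return 2
    fi
    # Quoting keeps spaces in DIR intact and stops the shell from
    # expanding wildcard characters in REGEX before grep sees them.
    find "$1" -type f | grep -Ei -- "$2"
}
```

It would be called as ff . '/book[^/]*\.html?$'. Note that the second argument is a regular expression, not a shell wildcard: passed as a regexp, "book*" means boo followed by zero or more k characters.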
>
> The '.' refers to the current directory. Use '/' to start from root.
>
> When I last investigated 'find' (a couple of years ago), I discovered that
> you should use 'locate' or better still 'slocate'. These are much faster,
> but depend on an index file being maintained.
RP> So how to update the database for locate and slocate?
Locate doesn't search as far as find does. It is limited to those paths that
its database was built for, and it only knows what existed the last time the
database was rebuilt (updatedb, normally run as root from cron). Find has no
such limits. If you have mounted other file systems, find will happily follow
them to their depths, unless you have told it to stay on a single file system
with -xdev.
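A minimal sketch of keeping find on one file system. The -xdev primary is standard find; the function name find_one_fs is made up for the example.

```shell
#!/bin/bash
# find_one_fs DIR PATTERN: like find DIR -name PATTERN, but -xdev stops
# find from descending into directories that live on other mounted
# file systems (network mounts, CD-ROMs and the like).
find_one_fs() {
    find "$1" -xdev -type f -name "$2"
}
```

For example, find_one_fs / 'book*' searches the root file system only and skips anything mounted beneath it.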
>
> I think that 'find' is strictly case sensitive, whereas 'slocate' has an
> option for case insensitive searching.
Find piped to grep is a combination that is infinitely powerful.
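On the case question, a short sketch comparing the two routes. It assumes a find that has -iname, which GNU and BSD find do but POSIX does not require. The file names are made up.

```shell
#!/bin/bash
# Throwaway files with made-up names, mixed case on purpose.
demo=$(mktemp -d)
touch "$demo/Book1.HTML" "$demo/book2.html" "$demo/other.txt"

# -name is case sensitive: only book2.html matches.
exact=$(find "$demo" -type f -name 'book*' | sort)

# -iname (GNU and BSD find, not POSIX) ignores case: both match.
anycase=$(find "$demo" -type f -iname 'book*' | sort)

# The same result from the pipe: grep -i does the case folding instead.
piped=$(find "$demo" -type f | grep -i '/book[^/]*$' | sort)

printf 'exact:\n%s\n\nanycase:\n%s\n\npiped:\n%s\n' "$exact" "$anycase" "$piped"
```

The pipe works with any find, which is one reason to prefer it when a script has to run on more than one flavour of Unix.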
My advice to anyone learning the Linux or Unix shells is to experiment and
learn the art of chaining commands together. It will serve you well for the
rest of your life.
Dave Calafrancesco, Team OS/2
dave{at}drakkar.org
... They got the library at Alexandria, they're not getting mine!
---
* Origin: Druid's Grove BBS - telnet:bbs.drakkar.org (1:2624/306)