module System.IO.Streams.Tutorial (
-- * Introduction-- $introduction-- * Build Input Streams-- $createinput-- * Build Output Streams-- $createoutput-- * Connect Streams-- $connect-- * Transform Streams-- $transform-- * Resource and Exception Safety-- $safety-- * Pushback-- $pushback-- * Thread Safety-- $threadsafety-- * Examples-- $examples
) where
{- $introduction
The @io-streams@ package defines two \"smart handles\" for stream processing:
* 'System.IO.Streams.InputStream': a read-only smart handle
* 'System.IO.Streams.OutputStream': a write-only smart handle
The 'System.IO.Streams.InputStream' type implements all the core operations we
expect for a read-only handle. We consume values using 'read', which returns a
'Nothing' when the resource is done:
@
'System.IO.Streams.read' :: 'System.IO.Streams.InputStream' c -> 'IO' ('Maybe' c)
@
... and we can push back values using 'System.IO.Streams.unRead':
@
'System.IO.Streams.unRead' :: c -> 'System.IO.Streams.InputStream' c -> 'IO' ()
@
The 'System.IO.Streams.OutputStream' type implements the
'System.IO.Streams.write' operation which feeds it output, supplying 'Nothing'
to signal resource exhaustion:
@
'System.IO.Streams.write' :: 'Maybe' c -> 'System.IO.Streams.OutputStream' c -> 'IO' ()
@
These streams slightly resemble Haskell 'System.IO.Handle's, but support a
wider range of sources and sinks. For example, you can convert an ordinary list
to an 'System.IO.Streams.InputStream' source and interact with it using the
handle-based API:
@
ghci> import qualified System.IO.Streams as S
ghci> listHandle \<- S.'System.IO.Streams.fromList' [1, 2]
ghci> S.'System.IO.Streams.read' listHandle
Just 1
ghci> S.'System.IO.Streams.read' listHandle
Just 2
ghci> S.'System.IO.Streams.read' listHandle
Nothing
@
Additionally, IO streams come with a library of stream transformations that
preserve their handle-like API. For example, you can map a function over an
'System.IO.Streams.InputStream', which generates a new handle to the same
stream that returns transformed values:
@
ghci> oldHandle \<- S.'System.IO.Streams.List.fromList' [1, 2, 3]
ghci> newHandle \<- S.'System.IO.Streams.Combinators.mapM' (\\x -\> 'return' (x * 10)) oldHandle
ghci> S.'System.IO.Streams.read' newHandle
10
ghci> -- We can still view the stream through the old handle
ghci> S.'System.IO.Streams.read' oldHandle
2
ghci> -- ... and switch back again
ghci> S.'System.IO.Streams.read' newHandle
30
@
IO streams focus on preserving the convention of traditional handles while
offering a wider library of stream-processing utilities.
-}{- $createinput
The @io-streams@ library provides a simple interface for creating your own
'System.IO.Streams.InputStream's and 'System.IO.Streams.OutputStream's.
You can build an 'System.IO.Streams.InputStream' from any 'IO' action that
generates output, as long as it wraps results in 'Just' and uses 'Nothing' to
signal EOF:
@
'System.IO.Streams.makeInputStream' :: 'IO' ('Maybe' a) -> 'IO' ('System.IO.Streams.InputStream' a)
@
As an example, let's wrap an ordinary read-only 'System.IO.Handle' in an
'System.IO.Streams.InputStream':
@
import "Data.ByteString" ('Data.ByteString.ByteString')
import qualified "Data.ByteString" as S
import "System.IO.Streams" ('System.IO.Streams.InputStream')
import qualified "System.IO.Streams" as Streams
import "System.IO" ('System.IO.Handle', 'System.IO.hFlush')
bUFSIZ = 32752
upgradeReadOnlyHandle :: 'System.IO.Handle' -> 'IO' ('System.IO.Streams.InputStream' 'Data.ByteString.ByteString')
upgradeReadOnlyHandle h = Streams.'System.IO.Streams.makeInputStream' f
where
f = do
x <- S.'Data.ByteString.hGetSome' h bUFSIZ
'return' $! if S.'Data.ByteString.null' x then 'Nothing' else 'Just' x
@
We didn't even really need to write the @upgradeReadOnlyHandle@ function,
because "System.IO.Streams.Handle" already provides one that uses the exact
same implementation given above:
@
'System.IO.Streams.handleToInputStream' :: 'System.IO.Handle' -> 'IO' ('System.IO.Streams.InputStream' 'Data.ByteString.ByteString')
@
-}{- $createoutput
Similarly, you can build any 'System.IO.Streams.OutputStream' from an 'IO'
action that accepts input, as long as it interprets 'Just' as more input and
'Nothing' as EOF:
@
'System.IO.Streams.makeOutputStream' :: ('Maybe' a -> 'IO' ()) -> 'IO' ('System.IO.Streams.OutputStream' a)
@
A simple 'System.IO.Streams.OutputStream' might wrap 'putStrLn' for 'Data.ByteString.ByteString's:
@
import "Data.ByteString" ('Data.ByteString.ByteString')
import qualified "Data.ByteString" as S
import "System.IO.Streams" ('System.IO.Streams.OutputStream')
import qualified "System.IO.Streams" as Streams
\
writeConsole :: 'IO' ('System.IO.Streams.OutputStream' 'Data.ByteString.ByteString')
writeConsole = Streams.'System.IO.Streams.makeOutputStream' $ \\m -> case m of
'Just' bs -> S.'Data.ByteString.putStrLn' bs
'Nothing' -> 'return' ()
@
The 'Just' wraps more incoming data, whereas 'Nothing' indicates the data is
exhausted. In principle, you can feed 'System.IO.Streams.OutputStream's more
input after writing a 'Nothing' to them, but IO streams only guarantee a
well-defined behavior up to the first 'Nothing'. After receiving the first
'Nothing', an 'System.IO.Streams.OutputStream' could respond to additional
input by:
* Using the input
* Ignoring the input
* Throwing an exception
Ideally, you should adhere to well-defined behavior and ensure that after you
write a 'Nothing' to an 'System.IO.Streams.OutputStream', you don't write
anything else.
-}{- $connect
@io-streams@ provides two ways to connect an 'System.IO.Streams.InputStream'
and 'System.IO.Streams.OutputStream':
@
'System.IO.Streams.connect' :: 'System.IO.Streams.InputStream' a -> 'System.IO.Streams.OutputStream' a -> 'IO' ()
'System.IO.Streams.supply' :: 'System.IO.Streams.InputStream' a -> 'System.IO.Streams.OutputStream' a -> 'IO' ()
@
'System.IO.Streams.connect' feeds the 'System.IO.Streams.OutputStream'
exclusively with the given 'System.IO.Streams.InputStream' and passes along the
end-of-stream notification to the 'System.IO.Streams.OutputStream'.
'System.IO.Streams.supply' feeds the 'System.IO.Streams.OutputStream'
non-exclusively with the given 'System.IO.Streams.InputStream' and does not
pass along the end-of-stream notification to the
'System.IO.Streams.OutputStream'.
You can combine both 'System.IO.Streams.supply' and 'System.IO.Streams.connect'
to feed multiple 'System.IO.Streams.InputStream's into a single
'System.IO.Streams.OutputStream':
@
import qualified "System.IO.Streams" as Streams
import "System.IO" ('System.IO.IOMode'('System.IO.WriteMode'))
main = do
Streams.'System.IO.Streams.withFileAsOutput' \"out.txt\" 'System.IO.WriteMode' $ \\outStream ->
Streams.'System.IO.Streams.withFileAsInput' \"in1.txt\" $ \\inStream1 ->
Streams.'System.IO.Streams.withFileAsInput' \"in2.txt\" $ \\inStream2 ->
Streams.'System.IO.Streams.withFileAsInput' \"in3.txt\" $ \\inStream3 ->
Streams.'System.IO.Streams.supply' inStream1 outStream
Streams.'System.IO.Streams.supply' inStream2 outStream
Streams.'System.IO.Streams.connect' inStream3 outStream
@
The final 'System.IO.Streams.connect' seals the
'System.IO.Streams.OutputStream' when the final 'System.IO.Streams.InputStream'
terminates.
Keep in mind that you do not need to use 'System.IO.Streams.connect' or
'System.IO.Streams.supply' at all: @io-streams@ mainly provides them for user
convenience. You can always build your own abstractions on top of the
'System.IO.Streams.read' and 'System.IO.Streams.write' operations.
-}{- $transform
When we build or use 'IO' streams we can tap into all the stream-processing
features the @io-streams@ library provides. For example, we can decompress any
'System.IO.Streams.InputStream' of 'Data.ByteString.ByteString's:
@
import "Control.Monad" ((>=>))
import "Data.ByteString" ('Data.ByteString.ByteString')
import "System.IO" ('System.IO.Handle')
import "System.IO.Streams" ('System.IO.Streams.InputStream', 'System.IO.Streams.OutputStream')
import qualified "System.IO.Streams" as Streams
import qualified "System.IO.Streams.File" as Streams
unzipHandle :: 'System.IO.Handle' -> 'IO' ('System.IO.Streams.InputStream' 'Data.ByteString.ByteString')
unzipHandle = Streams.'System.IO.Streams.handleToInputStream' >=> Streams.'System.IO.Streams.decompress'
@
... or we can guard it against a denial-of-service attack:
@
protectHandle :: 'System.IO.Handle' -> 'IO' ('System.IO.Streams.InputStream' 'Data.ByteString.ByteString')
protectHandle =
Streams.'System.IO.Streams.handleToInputStream' >=> Streams.'System.IO.Streams.throwIfProducesMoreThan' 1000000
@
@io-streams@ provides many useful functions such as these in its standard
library and you take advantage of them by defining IO streams that wrap your
resources.
-}{- $safety
IO streams use standard Haskell idioms for resource safety. Since all
operations occur in the IO monad, you can use 'Control.Exception.catch',
'Control.Exception.bracket', or various \"@with...@\" functions to guard any
'System.IO.Streams.read' or 'System.IO.Streams.write' without any special
considerations:
@
import qualified "Data.ByteString" as S
import "System.IO"
import "System.IO.Streams" ('System.IO.Streams.InputStream', 'System.IO.Streams.OutputStream')
import qualified "System.IO.Streams" as Streams
import qualified "System.IO.Streams.File" as Streams
main =
'System.IO.withFile' \"test.txt\" 'System.IO.ReadMode' $ \\handle -> do
stream <- Streams.'System.IO.Streams.handleToInputStream' handle
mBytes <- Streams.'System.IO.Streams.read' stream
case mBytes of
'Just' bytes -> S.'Data.ByteString.putStrLn' bytes
'Nothing' -> 'System.IO.putStrLn' \"EOF\"
@
However, you can also simplify the above example by using the convenience
function 'System.IO.Streams.File.withFileAsInput' from
"System.IO.Streams.File":
@
'System.IO.Streams.withFileAsInput'
:: 'System.IO.FilePath' -> ('System.IO.Streams.InputStream' 'Data.ByteString.ByteString' -> 'IO' a) -> 'IO' a
@
-}{- $pushback
All 'System.IO.Streams.InputStream's support pushback, which simplifies many
types of operations. For example, we can 'System.IO.Streams.peek' at an
'System.IO.Streams.InputStream' by combining 'System.IO.Streams.read' and
'System.IO.Streams.unRead':
@
'System.IO.Streams.peek' :: 'System.IO.Streams.InputStream' c -> 'IO' ('Maybe' c)
'System.IO.Streams.peek' s = do
x <- Streams.'System.IO.Streams.read' s
case x of
'Nothing' -> 'return' ()
'Just' c -> Streams.'System.IO.Streams.unRead' c s
'return' x
@
... although "System.IO.Streams" already exports the above function.
'System.IO.Streams.InputStream's can customize pushback behavior to support
more sophisticated support for pushback. For example, if you protect a stream
using 'System.IO.Streams.throwIfProducesMoreThan' and
'System.IO.Streams.unRead' input, it will subtract the unread input from the
total byte count. However, these extra features will not interfere with the
basic pushback contract, given by the following law:
@
'System.IO.Streams.unRead' c stream >> 'System.IO.Streams.read' stream == 'return' ('Just' c)
@
When you build an 'System.IO.Streams.InputStream' using
'System.IO.Streams.makeInputStream', it supplies the default pushback behavior
which just saves the input for the next 'System.IO.Streams.read' call. More
advanced users can use "System.IO.Streams.Internal" to customize their own
pushback routines.
{- NOTE: The library only exports pushback API for Sources, which are a
completely internal type, so should we teach the user how to define
custom pushback or not? Maybe that belongs in some sort of separate
"advanced" tutorial for System.IO.Streams.Internal. -}
-}{- $threadsafety
IO stream operations are not thread-safe by default for performance reasons.
However, you can transform an existing IO stream into a thread-safe one using
the provided locking functions:
@
'System.IO.Streams.lockingInputStream' :: 'System.IO.Streams.InputStream' a -> 'IO' ('System.IO.Streams.InputStream' a)
'System.IO.Streams.lockingOutputStream' :: 'System.IO.Streams.OutputStream' a -> 'IO' ('System.IO.Streams.OutputStream' a)
@
These functions do not prevent access to the previous IO stream, so you must
take care to not save the reference to the previous stream.
{- NOTE: Should I give specific performance numbers or just say something
like "a slight cost to performance" for locking? -}
{- NOTE: This could use a concrete example of a race condition that a user
might encounter without this protection. -}
-}-- $examples-- The following examples show how to use the standard library to implement-- traditional command-line utilities:---- @--{-\# LANGUAGE OverloadedStrings #-}----import Control.Monad ((>=>), join)--import qualified Data.ByteString.Char8 as S--import Data.Int (Int64)--import Data.Monoid ((\<>))--import "System.IO.Streams" ('System.IO.Streams.InputStream')--import qualified "System.IO.Streams" as Streams--import System.IO--import Prelude hiding (head)----cat :: 'FilePath' -> IO ()--cat file = 'System.IO.withFile' file ReadMode $ \\h -> do-- is <- Streams.'System.IO.Streams.handleToInputStream' h-- Streams.'System.IO.Streams.connect' is Streams.'System.IO.Streams.stdout'----grep :: S.'Data.ByteString.ByteString' -> 'FilePath' -> IO ()--grep pattern file = 'System.IO.withFile' file ReadMode $ \\h -> do-- is \<- Streams.'System.IO.Streams.handleToInputStream' h >>=-- Streams.'System.IO.Streams.lines' >>=-- Streams.'System.IO.Streams.filter' (S.isInfixOf pattern)-- os <- Streams.'System.IO.Streams.unlines' Streams.'System.IO.Streams.stdout'-- Streams.'System.IO.Streams.connect' is os----data Option = Bytes | Words | Lines----len :: 'System.IO.Streams.InputStream' a -> IO Int64--len = Streams.'System.IO.Streams.fold' (\\n _ -> n + 1) 0----wc :: Option -> 'FilePath' -> IO ()--wc opt file = 'System.IO.withFile' file ReadMode $-- Streams.'System.IO.Streams.handleToInputStream' >=> count >=> print-- where-- count = case opt of-- Bytes -> \\is -> do-- (is', cnt) <- Streams.'System.IO.Streams.countInput' is-- Streams.'System.IO.Streams.skipToEof' is'-- cnt-- Words -> Streams.'System.IO.Streams.words' >=> len-- Lines -> Streams.'System.IO.STreams.lines' >=> len----nl :: 'FilePath' -> IO ()--nl file = 'System.IO.withFile' file ReadMode $ \\h -> do-- nats <- Streams.'System.IO.Streams.fromList' [1..]-- ls \<- Streams.'System.IO.Streams.handleToInputStream' h >>= Streams.'System.IO.Streams.lines'-- is <- Streams.'System.IO.Streams.zipWith'-- (\\n bs -> S.pack (show n) \<> \" \" \<> bs)-- nats-- ls-- os <- Streams.'System.IO.Streams.unlines' Streams.'System.IO.Streams.stdout'-- Streams.'System.IO.Streams.connect' is os----head :: Int64 -> 'FilePath' -> IO ()--head n file = 'System.IO.withFile' file ReadMode $ \\h -> do-- is \<- Streams.'System.IO.Streams.handleToInputStream' h >>= Streams.'System.IO.Streams.lines' >>= Streams.'System.IO.Streams.take' n-- os <- Streams.'System.IO.Streams.unlines' Streams.'System.IO.Streams.stdout'-- Streams.'System.IO.Streams.connect' is os----paste :: 'FilePath' -> 'FilePath' -> IO ()--paste file1 file2 =-- 'System.IO.withFile' file1 ReadMode $ \\h1 ->-- 'System.IO.withFile' file2 ReadMode $ \\h2 -> do-- is1 \<- Streams.'System.IO.Streams.handleToInputStream' h1 >>= Streams.'System.IO.Streams.lines'-- is2 \<- Streams.'System.IO.Streams.handleToInputStream' h2 >>= Streams.'System.IO.Streams.lines'-- isT \<- Streams.'System.IO.Streams.zipWith' (\\l1 l2 -> l1 \<> \"\\t\" \<> l2) is1 is2-- os <- Streams.'System.IO.Streams.unlines' Streams.'System.IO.Streams.stdout'-- Streams.connect isT os----yes :: IO ()--yes = do-- is <- Streams.fromList (repeat \"y\")-- os <- Streams.unlines Streams.stdout-- Streams.connect is os-- @