Column Types¶
In the tutorial, we stored scalar data types in our tables, like strings, numbers and timestamps. In practice, we often work with more complicated types that need to be converted to a data type our database supports.
Customising String Field Lengths¶
As we discussed in TEXT
or VARCHAR
, a str
field type will be
created as a VARCHAR
, which has varying maximum-lengths depending on the database engine you are using.
For cases where you know you only need to store a certain length of text, string field maximum length can be reduced
using the max_length
validation argument to Field()
:
from typing import Optional
from sqlalchemy import Text
from sqlmodel import Field, Session, SQLModel, create_engine, select
from wonderwords import RandomWord
class Villian(SQLModel, table=True):
id: Optional[int] = Field(default=None, primary_key=True)
name: str = Field(index=True)
country_code: str = Field(max_length=2)
backstory: str = Field(sa_type=Text())
👀 Full file preview
from typing import Optional
from sqlalchemy import Text
from sqlmodel import Field, Session, SQLModel, create_engine, select
from wonderwords import RandomWord
class Villian(SQLModel, table=True):
id: Optional[int] = Field(default=None, primary_key=True)
name: str = Field(index=True)
country_code: str = Field(max_length=2)
backstory: str = Field(sa_type=Text())
sqlite_file_name = "database.db"
sqlite_url = f"sqlite:///{sqlite_file_name}"
engine = create_engine(sqlite_url, echo=True)
def create_db_and_tables():
SQLModel.metadata.create_all(engine)
def generate_backstory(words: int) -> str:
return " ".join(RandomWord().random_words(words, regex=r"\S+"))
def create_villains():
villian_1 = Villian(
name="Green Gobbler", country_code="US", backstory=generate_backstory(500)
)
villian_2 = Villian(
name="Arnim Zozza", country_code="DE", backstory=generate_backstory(500)
)
villian_3 = Villian(
name="Low-key", country_code="AS", backstory=generate_backstory(500)
)
with Session(engine) as session:
session.add(villian_1)
session.add(villian_2)
session.add(villian_3)
session.commit()
def count_words(sentence: str) -> int:
return sentence.count(" ") + 1
def select_villians():
with Session(engine) as session:
statement = select(Villian).where(Villian.name == "Green Gobbler")
results = session.exec(statement)
villian_1 = results.one()
print(
"Villian 1:",
{"name": villian_1.name, "country_code": villian_1.country_code},
count_words(villian_1.backstory),
)
statement = select(Villian).where(Villian.name == "Low-key")
results = session.exec(statement)
villian_2 = results.one()
print(
"Villian 2:",
{"name": villian_2.name, "country_code": villian_2.country_code},
count_words(villian_1.backstory),
)
def main():
create_db_and_tables()
create_villains()
select_villians()
if __name__ == "__main__":
main()
Warning
Database engines behave differently when you attempt to store longer text than the character length of the VARCHAR
column. Notably:
- SQLite does not enforce the length of a
VARCHAR
. It will happily store up to 500-million characters of text. - MySQL will emit a warning, but will also truncate your text to fit the size of the
VARCHAR
. - PostgreSQL will respond with an error code, and your query will not be executed.
However if you need to store much longer strings than VARCHAR
can allow, databases provide TEXT
or CLOB
(character large object) column types. We can use these by specifying an SQLAlchemy column type to the field
with the sa_type
keyword argument:
from wonderwords import RandomWord
class Villian(SQLModel, table=True):
id: Optional[int] = Field(default=None, primary_key=True)
name: str = Field(index=True)
country_code: str = Field(max_length=2)
backstory: str = Field(sa_type=Text())
sqlite_file_name = "database.db"
sqlite_url = f"sqlite:///{sqlite_file_name}"
engine = create_engine(sqlite_url, echo=True)
def create_db_and_tables():
SQLModel.metadata.create_all(engine)
def generate_backstory(words: int) -> str:
return " ".join(RandomWord().random_words(words, regex=r"\S+"))
def create_villains():
villian_1 = Villian(
name="Green Gobbler", country_code="US", backstory=generate_backstory(500)
)
villian_2 = Villian(
name="Arnim Zozza", country_code="DE", backstory=generate_backstory(500)
)
villian_3 = Villian(
name="Low-key", country_code="AS", backstory=generate_backstory(500)
)
with Session(engine) as session:
session.add(villian_1)
session.add(villian_2)
session.add(villian_3)
session.commit()
Tip
Text
also accepts a character length argument, which databases use to optimise the storage of a particular field.
Some databases support TINYTEXT
, SMALLTEXT
, MEDIUMTEXT
and LONGTEXT
column types - ranging from 255 bytes to
4 gigabytes. If you know the maximum length of data, specifying it like Text(1000)
will automatically select the
best-suited, supported type for your database engine.
With this approach, we can use any kind of SQLAlchemy type. For example, if we were building a mapping application, we could store spatial information:
from typing import Optional
from geoalchemy2.types import Geography
from sqlmodel import Field, SQLModel, create_engine
class BusStop(SQLModel, table=True):
id: Optional[int] = Field(default=..., primary_key=True)
latlng: Geography = Field(sa_type=Geography(geometry_type="POINT", srid=4326))
sqlite_file_name = "database.db"
sqlite_url = f"sqlite:///{sqlite_file_name}"
engine = create_engine(sqlite_url, echo=True)
SQLModel.metadata.create_all(engine)
Supported Types¶
Python types are mapped to column types as so:
Python type | SQLAlchemy type | Database column types |
---|---|---|
str | String | VARCHAR |
int | Integer | INTEGER |
float | Float | FLOAT, REAL, DOUBLE |
bool | Boolean | BOOL or TINYINT |
datetime.datetime | DateTime | DATETIME, TIMESTAMP, DATE |
datetime.date | Date | DATE |
datetime.timedelta | Interval | INTERVAL, INT |
datetime.time | Time | TIME, DATETIME |
bytes | LargeBinary | BLOB, BYTEA |
Decimal | Numeric | DECIMAL, FLOAT |
enum.Enum | Enum | ENUM, VARCHAR |
uuid.UUID | GUID | UUID, CHAR(32) |
In addition, the following types are stored as VARCHAR
:
- ipaddress.IPv4Address
- ipaddress.IPv4Network
- ipaddress.IPv6Address
- ipaddress.IPv6Network
- pathlib.Path
- pydantic.EmailStr
IP Addresses¶
IP Addresses from the Python ipaddress
module are stored as text.
import ipaddress
from datetime import UTC, datetime
from pathlib import Path
from uuid import UUID, uuid4
from pydantic import EmailStr
from sqlmodel import Field, SQLModel, create_engine
class Avatar(SQLModel, table=True):
id: UUID = Field(default_factory=uuid4, primary_key=True)
source_ip_address: ipaddress.IPv4Address
upload_location: Path
uploaded_at: datetime = Field(default=datetime.now(tz=UTC))
author_email: EmailStr
Filesystem Paths¶
Paths to files and directories using the Python pathlib
module are stored as text.
import ipaddress
from datetime import UTC, datetime
from pathlib import Path
from uuid import UUID, uuid4
from pydantic import EmailStr
from sqlmodel import Field, SQLModel, create_engine
class Avatar(SQLModel, table=True):
id: UUID = Field(default_factory=uuid4, primary_key=True)
source_ip_address: ipaddress.IPv4Address
upload_location: Path
uploaded_at: datetime = Field(default=datetime.now(tz=UTC))
author_email: EmailStr
Tip
The stored value of a Path is the basic string value: str(Path('../path/to/file'))
. If you need to store the full path
ensure you call absolute()
on the path before setting it in your model.
UUIDs¶
UUIDs from the Python uuid
module are stored as UUID
types in supported databases (just PostgreSQL at the moment), otherwise as a CHAR(32)
.
import ipaddress
from datetime import UTC, datetime
from pathlib import Path
from uuid import UUID, uuid4
from pydantic import EmailStr
from sqlmodel import Field, SQLModel, create_engine
class Avatar(SQLModel, table=True):
id: UUID = Field(default_factory=uuid4, primary_key=True)
source_ip_address: ipaddress.IPv4Address
upload_location: Path
uploaded_at: datetime = Field(default=datetime.now(tz=UTC))
author_email: EmailStr
Email Addresses¶
Email addresses using Pydantic's EmailStr
type
are stored as strings.
import ipaddress
from datetime import UTC, datetime
from pathlib import Path
from uuid import UUID, uuid4
from pydantic import EmailStr
from sqlmodel import Field, SQLModel, create_engine
class Avatar(SQLModel, table=True):
id: UUID = Field(default_factory=uuid4, primary_key=True)
source_ip_address: ipaddress.IPv4Address
upload_location: Path
uploaded_at: datetime = Field(default=datetime.now(tz=UTC))
author_email: EmailStr
Custom Pydantic types¶
As SQLModel is built on Pydantic, you can use any custom type as long as it would work in a Pydantic model. However, if the type is not a subclass of a type from the table above, you will need to specify an SQLAlchemy type to use.